Abstract

Maximum satisfiability is a canonical NP-hard optimization problem that appears empirically hard
for random instances. In particular, its apparent hardness on random k-CNF formulas of certain densities
was recently suggested by Feige as a starting point for studying inapproximability. At the same time, it
is rapidly becoming a canonical problem for statistical physics. In both of these realms, evaluating new
ideas relies crucially on knowing the maximum number of clauses one can typically satisfy in a random
k-CNF formula. In this paper we give asymptotically tight estimates for this quantity. Specifically, let us
say that a k-CNF is p-satisfiable if there exists a truth assignment satisfying a $1 - 2^{-k} + p2^{-k}$
fraction of all clauses (observe that every k-CNF is 0-satisfiable). Also, let $F_k(n,m)$ denote a random
k-CNF on n variables formed by selecting uniformly and independently m out of all $2^k\binom{n}{k}$
possible k-clauses.

Let $\tau(p) = 2^k \ln 2/(p + (1-p)\ln(1-p))$. It is easy to prove that for every $k \ge 2$ and every
$p \in (0,1]$, if $r \ge \tau(p)$ then the probability that $F_k(n,rn)$ is p-satisfiable tends to 0 as
$n \to \infty$. We prove that there exists a sequence $\delta_k \to 0$ such that if
$r \le (1-\delta_k)\tau(p)$ then the probability that $F_k(n,rn)$ is p-satisfiable tends to 1 as
$n \to \infty$. The sequence $\delta_k$ tends to 0 exponentially fast in k. Indeed, even for moderate
values of k, e.g. k = 10, our result gives very tight bounds for the number of satisfiable clauses in a
random k-CNF. In particular, for k > 2 it improves upon all previously known such bounds.



*Part of this work was done while visiting UC Berkeley. 

† Research supported by NSF Grant DMS-0104073 and a Miller Professorship at UC Berkeley.
1 Introduction



Given a Boolean CNF formula F, the Satisfiability problem is to determine whether there exists a truth
assignment that satisfies F. When F has exactly k literals in each clause, Satisfiability is known as k-SAT
and is NP-complete [Coo71] for all $k \ge 3$. A natural generalization of satisfiability is determining whether
there exists a truth assignment that satisfies a given number of clauses in F. For k-CNF this problem is
known as Max k-SAT and is NP-complete for all $k \ge 2$ (see [GJ79]).

Optimization problems with random inputs are pervasive in operations research (e.g., the travelling
salesman problem and variants), in statistical physics (determining ground states of spin glasses) and in
computer science. An interesting source of Max k-SAT instances comes from considering k-CNF chosen
uniformly at random (see below). Historically, the motivation for studying such formulas has been the desire
to understand the hardness of "typical" instances. Random k-CNF are by now the most studied generative
model for random formulas and have been a very popular benchmark for testing and tuning satisfiability
algorithms. In fact, some of the better practical ideas in use today come from insights gained by studying
the performance of algorithms on random fc-CNF |ijLM5^1 IC^CKOO| . 

A natural starting point for considering Max k-SAT is the observation that for every k-CNF formula
there exists a truth assignment satisfying at least a $1 - 2^{-k}$ fraction of all clauses. Indeed, if such a
formula has m clauses, the average over all $2^n$ truth assignments of the number of satisfied clauses is
precisely $(1 - 2^{-k})m$. With this in mind, we will say that a k-CNF formula is p-satisfiable, where
$p \in [0,1]$, if there exists a truth assignment satisfying a $1 - 2^{-k} + p2^{-k}$ fraction of all clauses.
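The averaging argument can be checked by brute force on a tiny instance. The sketch below is illustrative only: it assumes clauses on k distinct variables (for which each clause is violated by exactly a $2^{-k}$ fraction of assignments), and the helper names `random_proper_kcnf`, `num_satisfied` and `average_satisfied` are hypothetical.

```python
import itertools
import random

def random_proper_kcnf(n, m, k, rng):
    """m clauses, each on k distinct variables with random signs.
    A literal is a pair (variable index, truth value that satisfies it)."""
    return [[(v, rng.random() < 0.5) for v in rng.sample(range(n), k)]
            for _ in range(m)]

def num_satisfied(formula, sigma):
    """Number of clauses with at least one literal satisfied by sigma."""
    return sum(any(sigma[v] == val for v, val in clause) for clause in formula)

def average_satisfied(formula, n):
    """Average of num_satisfied over all 2^n truth assignments."""
    total = sum(num_satisfied(formula, sigma)
                for sigma in itertools.product([False, True], repeat=n))
    return total / 2 ** n

rng = random.Random(0)
n, m, k = 8, 20, 3
F = random_proper_kcnf(n, m, k, rng)
avg = average_satisfied(F, n)
best = max(num_satisfied(F, s)
           for s in itertools.product([False, True], repeat=n))
print(avg, best)
```

Since the maximum is at least the average, `best` is never below $(1 - 2^{-k})m$, which is the a priori guarantee that p-satisfiability is measured against.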

To consider random k-CNF formulas, let $C_k$ denote the set of all $(2n)^k$ possible disjunctions of k literals
on some canonical set of n Boolean variables. To form a random k-CNF formula $F_k(n,m)$ with m clauses
we select uniformly, independently and with replacement m clauses from $C_k$ and take their conjunction.¹
We will say that a sequence of random events $\mathcal{E}_n$ occurs with high probability (w.h.p.) if
$\lim_{n\to\infty}\Pr[\mathcal{E}_n] = 1$ and with uniformly positive probability if
$\liminf_{n\to\infty}\Pr[\mathcal{E}_n] > 0$. We emphasize that throughout the paper
k is arbitrarily large but fixed, while $n \to \infty$. For every $k \ge 2$ and $p \in (0,1]$, let

$$r_k(p) \equiv \sup\{r : F_k(n,rn) \text{ is p-satisfiable w.h.p.}\}
  \le \inf\{r : F_k(n,rn) \text{ is not p-satisfiable w.h.p.}\} \equiv r_k^*(p).$$

One of the most intriguing aspects of random formulas is the Satisfiability Threshold Conjecture which
asserts that $r_k(1) = r_k^*(1)$ for every $k \ge 3$. Much work has been done to bound $r_k(1)$ and $r_k^*(1)$.
Currently, the best rigorous bounds for general $k \ge 3$, from [AP03, DB97] respectively, are:
$2^k \ln 2 - O(k) \le r_k(1) \le r_k^*(1) \le 2^k \ln 2 - O(1)$. For p < 1, the bounds for $r_k(p)$, $r_k^*(p)$
were much further apart.

The state of the art for general k was presented in an important recent paper by Coppersmith, Gamarnik,
Hajiaghayi, and Sorkin [CGHS03], where it was proved (see (6) for a more precise formulation) that there
exists an absolute constant c > 0 such that for all k and all $p \in (0, p_0(k)]$,

$$\frac{c}{k}\cdot\frac{2^{k+1}\ln 2}{p^2} < r_k(p) \le r_k^*(p) < \frac{2^{k+1}\ln 2}{p^2\,(1+o(1))}. \tag{1}$$

The upper bound in (1) was proved via the first moment method, while the lower bound is algorithmic.
For small k the two are reasonably close, but the ratio between them tends to infinity as k grows; this
naturally raises the question of which bound is closer to the truth. Our main result resolves this question by
pinpointing the values of $r_k(p)$ and $r_k^*(p)$ with relative error that tends to zero exponentially fast in k.
For every $p \in (0,1)$ denote

$$\tau_k(p) = \frac{2^k \ln 2}{p + (1-p)\ln(1-p)} \tag{2}$$

and let $\tau_k(1) = 2^k \ln 2$ so that $\tau_k(\cdot)$ is continuous on (0, 1].

¹ Our discussion and results hold in all common models for random k-CNF, e.g. when clause replacement is not allowed
and/or when each k-clause is formed by selecting k distinct, non-complementary literals with/without ordering. The model
defined here is best suited for our calculations. We further comment on its relationship to other models at the end of Section 2.
Theorem 1. There exists a sequence $\delta_k = O(k2^{-k/2})$ such that for all $k \ge 2$ and $p \in (0,1]$,

$$(1-\delta_k)\,\tau_k(p) < r_k(p) \le r_k^*(p) < \tau_k(p). \tag{3}$$

The upper bound in (3) follows from well-known tail estimates. Taylor expansion gives that as $p \to 0$,

$$\tau_k(p) = \frac{2^k \ln 2}{p^2/2 + O(p^3)},$$

so as $p \to 0$, we can sharpen (1) to

$$(1-\delta_k)\,\frac{2^{k+1}\ln 2}{p^2 + O(p^3)} < r_k(p) \le r_k^*(p) < \frac{2^{k+1}\ln 2}{p^2 + O(p^3)}. \tag{4}$$
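As a quick numerical sanity check of the small-p expansion and of the continuity of $\tau_k$ at p = 1, one can evaluate (2) directly. This is only an illustrative sketch; the function names `tau` and `small_p_leading_term` are hypothetical, and `log1p` is used to limit cancellation for small p.

```python
import math

def tau(k, p):
    """tau_k(p) = 2^k ln 2 / (p + (1 - p) ln(1 - p)), with tau_k(1) = 2^k ln 2."""
    if p == 1:
        return 2 ** k * math.log(2)
    return 2 ** k * math.log(2) / (p + (1 - p) * math.log1p(-p))

def small_p_leading_term(k, p):
    """Leading term 2^(k+1) ln 2 / p^2 of tau_k(p) as p -> 0."""
    return 2 ** (k + 1) * math.log(2) / p ** 2

k = 10
# The ratio tends to 1 as p -> 0 because the denominator is p^2/2 + O(p^3).
ratio_small_p = tau(k, 1e-3) / small_p_leading_term(k, 1e-3)
# The ratio tends to 1 as p -> 1 because (1 - p) ln(1 - p) -> 0.
ratio_near_one = tau(k, 1 - 1e-12) / tau(k, 1)
print(ratio_small_p, ratio_near_one)
```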



Our proof of Theorem 1 actually yields an explicit lower bound for $r_k(p)$ for each $k \ge 2$. For k = 2,
i.e. Max 2-SAT, the algorithm presented in [CGHS03] dominates our lower bound uniformly, i.e. for every
density it satisfies a greater fraction of all clauses. Already for $k \ge 3$, though, our methods yield a better
bound, as indicated by the following plots.
Figure 1. Upper and lower bounds for the density r as a function of q = 1 − p.

Our approach in proving Theorem 1 is non-algorithmic, based instead on a delicate application of the
second moment method to a random generating function in two variables. It is notoriously difficult to obtain
precise asymptotics from such random multivariable generating functions; the fact that this is possible for
random Max k-SAT is technically due to the surprising cancellation of four terms of equal magnitude in our
analysis, leaving only lower order terms. This cancellation hints at the existence of some unexpected hidden
structure in random Max k-SAT; characterizing this structure combinatorially (rather than just analytically)
appears to us worthy of further study.
1.1 Background

For a random formula $F_k(n,m)$, denote by $s_k(n,m)$ the random variable equal to the maximum (over all
truth assignments $\sigma$) of the number of clauses satisfied by $\sigma$. Perhaps the first rigorous study of random
Max k-SAT appeared in the work of Broder, Frieze and Upfal [BFU93] where it was shown that $s_k(n,m)$ is
sharply concentrated around its mean. Specifically,

Theorem 2 ([BFU93]). $\Pr\bigl[\,|s_k(n,m) - \mathbf{E}[s_k(n,m)]| > t\,\bigr] < 2\exp(-2t^2/m)$.



The following corollary allows us to infer high probability results from positive probability results.

Corollary 1. If $F_k(n,rn)$ is $p_0$-satisfiable with uniformly positive probability, then for every constant
$p < p_0$, $F_k(n,rn)$ is p-satisfiable w.h.p.

Proof. Let $S = (1 - 2^{-k} + p_0 2^{-k})rn$. Since $F_k(n,rn)$ is $p_0$-satisfiable with uniformly positive probability,
$\mathbf{E}[s_k(n,rn)] > S - n^{2/3}$. For, otherwise, Theorem 2 would imply that the probability of $p_0$-satisfiability
is exponentially small. By the same token, $\Pr[s_k(n,rn) < S - 2n^{2/3}] = o(1)$, implying the claim. □

Thus it will suffice to find, for every $p \in (0,1]$, a value r = r(p) such that $F_k(n,rn)$ is p-satisfiable with
uniformly positive probability and rely on Corollary 1 to get a high probability result.

Regarding the mean in Theorem 2, in view of the a priori bound $s_k(n,m) \ge (1 - 2^{-k})m$, it is natural to
consider $\Phi_k(n,m) = \mathbf{E}[s_k(n,m)] - (1 - 2^{-k})m$, measuring how much the optimum truth assignment does
better than the a priori bound in expectation (over random k-CNF). In [CGHS03] it was shown that for all
k, for sufficiently large r, as $n \to \infty$ one has in $F_k(n,rn)$



k ^ ^, s <^k(n,rn) /(2'=-l)ln2 _ 
This is equivalent to the assertion that for p sufficiently small, 

1.2^+2 

^^^-j-^ X p-2 - O(p-i) < r,{p) < rlip) < 2(2'= - l)ln2 x p-' , (6) 

which is a more precise formulation of (1).

Since for k = 2 the threshold for satisfiability is known, namely $r_2(1) = r_2^*(1) = 1$, in [CGHS03] very
fine results were derived for $s_2(n,rn)$ when $r \approx 1$. In particular, when $r = 1 + \epsilon$ one has
$\mathbf{E}[s_2(n,m)] = (1 + \epsilon - O(\epsilon^3))n$, while for large $r \ge 1$ the bound in (5) can be improved to



VS-1 r- ^ $2(n,™) /31n2 _ 

3v7i' n V 8 

Another intriguing aspect of random k-CNF formulas is their proof complexity. In a seminal paper,
Chvátal and Szemerédi [CS88] proved that for all $k \ge 3$ and $r \ge 2^k \ln 2$ there exists $\epsilon = \epsilon(r)$ such that
w.h.p. every resolution refutation of $F_k(n,rn)$ contains at least $(1+\epsilon)^n$ clauses. Since then there have been
a number of extensions of this result [BP96, BKPS02] and it is widely believed that random k-CNF are hard
for much stronger proof systems than resolution. Indeed, recently, Feige [Fei02] showed that a hypothesis
asserting that proving unsatisfiability of random k-CNF with $r \gg 2^k \ln 2$ is hard implies a number of strong
inapproximability results. A closely related hypothesis is that approximating Max k-SAT for such formulas
is also hard for all $k \ge 2$. Recent work by Fernandez de la Vega and Karpinski [FdlVK02] proves that one
can approximate Max 3-SAT on $F_3(n,rn)$ within 9/8, which is better than the trivial 8/7 bound.
2 Outline



2.1 Understanding correlation sources in MAX k-SAT 

The following easy consequence of the Cauchy-Schwarz inequality underlies the second moment method.

Lemma 1. For any non-negative random variable X,

$$\Pr[X > 0] \ge \frac{\mathbf{E}[X]^2}{\mathbf{E}[X^2]}.$$

Thus, for any fixed $p \in (0,1]$ one can let X denote the number of p-satisfying assignments and apply Lemma 1
to bound $\Pr[X > 0]$ from below. Unfortunately, it turns out that for every r > 0, there exists a constant
$\beta = \beta(k,r) > 1$ such that $\mathbf{E}[X^2] > \beta^n \mathbf{E}[X]^2$. As a result, this straightforward approach only gives a
trivial lower bound on the probability of p-satisfiability.
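To see how the bound of Lemma 1 degrades when a rare event dominates the second moment, consider the following toy computation. It is a sketch with a made-up three-point distribution (not taken from the paper) and a hypothetical helper `second_moment_bound`; exact rational arithmetic avoids rounding questions.

```python
from fractions import Fraction

def second_moment_bound(dist):
    """dist maps each value of a non-negative random variable X to its probability.
    Returns (Pr[X > 0], E[X]^2 / E[X^2])."""
    first = sum(prob * x for x, prob in dist.items())
    second = sum(prob * x * x for x, prob in dist.items())
    prob_positive = sum(prob for x, prob in dist.items() if x > 0)
    return prob_positive, first * first / second

# Hypothetical toy law: X is usually 0 or 1, but very large on a rare event.
dist = {0: Fraction(9, 10), 1: Fraction(9, 100), 100: Fraction(1, 100)}
prob_positive, bound = second_moment_bound(dist)
print(prob_positive, bound)
```

Here X is positive with probability 1/10, yet the bound certifies only about 0.012, because the rare large value inflates $\mathbf{E}[X^2]$. This is the same mechanism, on a small scale, by which highly correlated assignments ruin the naive second moment argument.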

In [AP03], it was shown that in the case p = 1 a major factor in the excessive correlations behind the above
failure is the following form of populism: leaning toward the majority vote truth assignment. To see this, first
observe that truth assignments that satisfy more literal occurrences than average have higher probability
of being satisfying. At the same time, in order to satisfy many literal occurrences such assignments tend to
agree with each other (and the majority truth assignment) on more than half the variables. As a result, the
successes of such assignments tend to be highly correlated, thus dominating $\mathbf{E}[X^2]$. In order to avoid this
pitfall, we would like, as in [AP03], to apply the second moment method to truth assignments that satisfy,
approximately, half of all literal occurrences; we call such truth assignments "balanced". In the context of
p-satisfiability, however, there are new obstacles to overcome before obtaining a lower bound for $r_k(p)$ that
asymptotically matches the upper bound. To capture the behavior of balanced truth assignments we begin
by defining two "fitness" gauges.

Given any k-CNF formula F on n variables and any truth assignment $\sigma \in \{0,1\}^n$ let

1. $H = H(\sigma,F)$ be the number of satisfied literal occurrences in F under $\sigma$, minus the number of
unsatisfied literal occurrences in F under $\sigma$.

2. $U = U(\sigma,F)$ be the number of unsatisfied clauses in F under $\sigma$.
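The two gauges are straightforward to compute. The following sketch samples a formula as km i.i.d. uniformly random literals, as in the model defined in the Introduction, and evaluates H and U; the representation of a literal as a pair and the names `random_formula` and `gauges` are illustrative assumptions.

```python
import random

def random_formula(n, m, k, rng):
    """km i.i.d. uniformly random literals grouped into m clauses.
    A literal is a pair (variable index, truth value that satisfies it)."""
    return [[(rng.randrange(n), rng.random() < 0.5) for _ in range(k)]
            for _ in range(m)]

def gauges(sigma, formula):
    """Return (H, U): H counts satisfied minus unsatisfied literal occurrences,
    U counts clauses in which no literal is satisfied."""
    H = U = 0
    for clause in formula:
        satisfied = [sigma[v] == sign for v, sign in clause]
        H += sum(1 if s else -1 for s in satisfied)
        U += not any(satisfied)
    return H, U

rng = random.Random(1)
n, m, k = 30, 120, 3
F = random_formula(n, m, k, rng)
sigma = [rng.random() < 0.5 for _ in range(n)]
flipped = [not b for b in sigma]
H, U = gauges(sigma, F)
H_flipped, U_flipped = gauges(flipped, F)
print(H, U, H_flipped, U_flipped)
```

Flipping every variable negates H, and no clause can be violated by both an assignment and its complement, which gives two cheap consistency checks on the implementation.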

We would like to focus on truth assignments that are balanced and p-satisfying, up to fluctuations one
would expect from a central limit theorem, i.e., truth assignments $\sigma$ such that

$$|H(\sigma,F)| \le A\sqrt{m} \tag{8}$$
$$|U(\sigma,F) - (1-p)2^{-k}m| \le A\sqrt{m}. \tag{9}$$

To do this let us write $u_0 = (1-p)2^{-k}$ and fix some $\gamma, \eta < 1$. Now, for a random k-CNF formula F,
consider the weighted sum

$$X = X(\gamma,\eta) = \sum_{\sigma} \gamma^{H(\sigma,F)}\,\eta^{U(\sigma,F) - u_0 m}.$$

Since $\gamma, \eta < 1$ we see that in X the truth assignments $\sigma$ for which $H(\sigma,F) > 0$ or $U(\sigma,F) > u_0 m$ are
suppressed exponentially, whereas the rest are rewarded exponentially. Decreasing $\gamma, \eta \in [0,1)$ makes this
phenomenon more and more acute, with the limiting case $\gamma, \eta = 0$ corresponding to a 0-1 weighting scheme
(we adopt the convention $0^0 = 1$). Indeed, applying the second moment method to X with $\eta = 0$ corresponds
to the approach of [AP03] for the random k-SAT threshold, where only satisfying assignments receive
non-zero weight $\gamma^{H(\sigma,F)}$. A key step in our analysis, presented in Subsection 2.3, is the tuning of the parameters
$\gamma, \eta$ to focus on truth assignments $\sigma$ for which (8) and (9) hold. Before doing that, we establish the upper
bound in Theorem 1.
2.2 The upper bound in Theorem 1

This upper bound can be readily established by using the entropic-form Chernoff bound for the Binomial
(see Lemma A.10 in [AS91] or Lemma 3.8 in [DM95]), but it is more informative to give a self-contained
argument. Recall the definition of $\tau_k(\cdot)$ from (2).

Lemma 2. For all $k \ge 2$ and $p \in (0,1]$, if q = 1 − p then

$$r_k^*(p) \le \frac{2^k \ln 2}{q\ln q - (2^k - q)\ln\left(\frac{2^k-1}{2^k-q}\right)} < \tau_k(p). \tag{10}$$

Proof. The right hand inequality of (10) follows from the inequality $\ln t < t - 1$ applied to
$t = (2^k-1)/(2^k-q)$, so we just need to verify the left hand inequality. To do that, write $u_0 = q2^{-k}$.
Let $\eta \in (0,1)$, and observe that if F is p-satisfiable, then $U(\sigma,F) \le u_0 m$ for some $\sigma$, whence

$$X(1,\eta) = \sum_{\sigma} \eta^{U(\sigma,F) - u_0 m} \ge 1.$$

From (20) in the next section we have that

$$\Pr[X(1,\eta) \ge 1] \le \mathbf{E}[X(1,\eta)] = 2^n\,\eta^{-u_0 m}\left(1 - (1-\eta)2^{-k}\right)^m. \tag{11}$$

Thus, the probability of p-satisfiability decays exponentially in n if the n-th root of the RHS of (11) is
strictly smaller than 1. Taking $\eta = q(2^k - 1)/(2^k - q)$ yields the lemma. □
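The first moment calculation is easy to reproduce numerically. The sketch below assumes the form of (10) with denominator $q\ln q - (2^k-q)\ln((2^k-1)/(2^k-q))$ and the bound (11) with m = rn; the function names are hypothetical. It checks that the n-th root of the right-hand side of (11) equals 1 exactly at the density given by (10) when $\eta = q(2^k-1)/(2^k-q)$, and that other values of $\eta$ give a weaker bound.

```python
import math

def tau(k, p):
    """tau_k(p) from (2), with tau_k(1) = 2^k ln 2."""
    if p == 1:
        return 2 ** k * math.log(2)
    return 2 ** k * math.log(2) / (p + (1 - p) * math.log1p(-p))

def lemma2_bound(k, p):
    """Middle expression of (10); uses the convention 0 ln 0 = 0."""
    q, K = 1 - p, 2 ** k
    q_ln_q = q * math.log(q) if q > 0 else 0.0
    return K * math.log(2) / (q_ln_q - (K - q) * math.log((K - 1) / (K - q)))

def nth_root_of_first_moment(k, p, r, eta):
    """n-th root of the right-hand side of (11) with m = rn."""
    u0 = (1 - p) * 2.0 ** -k
    return 2 * (eta ** -u0 * (1 - (1 - eta) * 2.0 ** -k)) ** r

k, p = 3, 0.5
q = 1 - p
eta_star = q * (2 ** k - 1) / (2 ** k - q)
r_star = lemma2_bound(k, p)
root_at_threshold = nth_root_of_first_moment(k, p, r_star, eta_star)
print(r_star, tau(k, p), root_at_threshold)
```

For k = 3 and p = 1/2 the first moment bound is visibly below $\tau_k(p)$, consistent with the strict right-hand inequality in (10).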



2.3 Tuning parameters and truncation 

When $\eta > 0$, attempting to apply the second moment method to X we encounter two major problems.

The first problem is that while X > 0 implies satisfiability when $\eta = 0$, when $\eta > 0$ having X > 0 does
not imply $p_0$-satisfiability: in principle, X could be positive due to the contribution of assignments falsifying
many more clauses than $u_0 m$. This necessitates restricting the sum defining X to truth assignments falsifying
at most $u_0 m + O(\sqrt{m})$ clauses, i.e. truncating X.

The second, more severe, problem is that with or without this truncation, $\mathbf{E}[X]^2/\mathbf{E}[X^2]$ becomes
exponentially small when r is only, roughly, half the (asymptotically optimal) lower bound of Theorem 1. Rather
counterintuitively, we will be able to delay this explosion until r is within 1 − o(1) of the upper bound by
also removing from the sum those "heroic" truth assignments falsifying fewer than $u_0 m$ clauses. This affords
us much tighter control of pairs of assignments that agree on nearly all variables, which turn out to be the
dominant contributors to $\mathbf{E}[X^2]$ as we approach the upper bound. The idea behind this sacrifice is motivated
by Cramér's classical "change of measure" technique in large deviation theory. The corresponding "adaptive
weighting" scheme requires an extremely sharp asymptotic analysis, involving a number of rather miraculous
cancellations. Due to space limitations this analysis appears entirely in the Appendix.

Specifically, for some fixed A > 0 let

$$S^* = \{\sigma \in \{0,1\}^n : H(\sigma,F) \ge 0 \text{ and } U(\sigma,F) \in [u_0 m,\; u_0 m + A\sqrt{m}]\}.$$

Moreover, given $u_0$, let $\gamma_0, \eta_0$ be defined by

$$1 - \eta_0 = (1-\gamma_0^2)(1+\gamma_0^2)^{k-1}, \qquad u_0 = \frac{\eta_0}{(1+\gamma_0^2)^k - 1 + \eta_0}. \tag{12}$$

These two equations are designed so that the main contribution in the sum defining X comes from truth
assignments for which (8) and (9) hold. The connection is made in equations (25) and (26) in Section 4.
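For concreteness, the tuned parameters can be computed numerically. The sketch below is not the authors' code. It assumes the balance conditions in the form $1-\eta_0 = (1-\gamma_0^2)(1+\gamma_0^2)^{k-1}$ and $\eta_0 = 2u_0\gamma_0^2(1+\gamma_0^2)^{k-1}$ (the second being what the requirement $\mathbf{E}_\sigma[U(\sigma,c)] = u_0$ of Section 4 reduces to), eliminates $\eta_0$, and finds $x = \gamma_0^2$ by bisection. The names `solve_gamma0_eta0` and `tilted_means` are hypothetical, and the routine assumes p > 0 (and $k \ge 3$ when p = 1).

```python
import math

def solve_gamma0_eta0(k, p, iterations=200):
    """Return (gamma_0, eta_0, u_0) solving the two balance conditions."""
    u0 = (1 - p) * 2.0 ** -k
    c = 1 - 2 * u0
    # With x = gamma_0^2, eliminating eta_0 leaves h(x) = 0 on (0, 1).
    h = lambda x: (1 + x) ** (k - 1) * (1 - c * x) - 1
    # h increases up to x = (k - 1 - c) / (k c), then decreases to h(1) = -p < 0.
    lo, hi = (k - 1 - c) / (k * c), 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if h(mid) > 0:
            lo = mid
        else:
            hi = mid
    x = (lo + hi) / 2
    return math.sqrt(x), 2 * u0 * x * (1 + x) ** (k - 1), u0

def tilted_means(k, gamma, eta):
    """Single-clause means of H and U under the reweighted clause distribution."""
    Z = ((gamma + 1 / gamma) / 2) ** k - (1 - eta) * (2 * gamma) ** -k
    mean_H = (k * (gamma - 1 / gamma) * (gamma + 1 / gamma) ** (k - 1) / 2 ** k
              + k * (2 * gamma) ** -k * (1 - eta)) / Z
    mean_U = (2 * gamma) ** -k * eta / Z
    return mean_H, mean_U

gamma0, eta0, u0 = solve_gamma0_eta0(3, 0.5)
mean_H, mean_U = tilted_means(3, gamma0, eta0)
print(gamma0, eta0, mean_H, mean_U - u0)
```

With these parameters a typical clause under the reweighted distribution has zero expected literal imbalance and violation probability $u_0$, which is the sense in which the weights focus on balanced, p-satisfying assignments.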






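We define

$$X_* = \sum_{\sigma \in S^*} \gamma_0^{H(\sigma,F)}\,\eta_0^{U(\sigma,F) - u_0 m}.$$

Note that, by definition, when $X_* > 0$ at least one truth assignment must falsify at most $u_0 m + A\sqrt{m}$ clauses.
Thus, if for a given $p_0$ we can prove that there exists a constant D > 0 such that $\mathbf{E}[X_*^2] < D \times \mathbf{E}[X_*]^2$ then
Lemma 1 gives $\Pr[X_* > 0] > 1/D$ and, by Corollary 1, it follows that $F_k(n,rn)$ is w.h.p. p-satisfiable for all $p < p_0$.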

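Bounding the second moment of $X_*$ will be accomplished in the following lemmata. For $\alpha \in [0,1]$, let

$$f(\alpha,\gamma,\eta) = \eta^{-2u_0}\left[\left(\alpha\,\frac{\gamma^2+\gamma^{-2}}{2} + 1 - \alpha\right)^{k}
 - 2(1-\eta)\,2^{-k}\left(\alpha\gamma^{-2} + 1 - \alpha\right)^{k}
 + (1-\eta)^2\,2^{-k}\alpha^k\gamma^{-2k}\right] \tag{13}$$

and

$$g_r(\alpha,\gamma,\eta) = \frac{f(\alpha,\gamma,\eta)^r}{\alpha^{\alpha}(1-\alpha)^{1-\alpha}}. \tag{14}$$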



In all of the following lemmata $k \ge 2$ is a fixed integer and r > 0.

• Lemma 3 with $\gamma = \gamma_0$ and $\eta = \eta_0$ gives us $\mathbf{E}[X(\gamma_0,\eta_0)]^2$, which is $\mathbf{E}[X_*]^2$ but for the truncation.

• Lemma 4 asserts that for every value of $u_0$, $\mathbf{E}[X_*]$ is a constant fraction of $\mathbf{E}[X(\gamma_0,\eta_0)]$. Thus, combined
with Lemma 3 it gives us $\mathbf{E}[X_*]^2$ up to a constant factor (which is all we need).

• Lemma 5 expresses $\mathbf{E}[X_*^2]$ as a sum with n + 1 terms, the z-th term capturing the contribution of the
$2^n\binom{n}{z}$ pairs of truth assignments with overlap z. The contribution of each such pair is then bounded
by $f(z/n,\gamma,\eta)^{rn}$ where $\gamma, \eta$ are allowed to depend on z, subject only to $\gamma \ge \gamma_0$ and $\eta \ge \eta_0$ respectively.
In other words, Lemma 5 allows us to adapt $\gamma$ and $\eta$ to $\alpha$, which is crucial when p < 1.

• Lemma 6 is based on the fact that for any "smooth" choice of sequences $\gamma(z), \eta(z)$, the sum in Lemma 5
will be dominated by the contribution of the $\Theta(n^{1/2})$ terms around the maximum term. Specifically, if
$\chi, \omega$ express our adaptive scheme for $\gamma, \eta$, then we can use the Laplace method to get that the maximum
of $g_r(\alpha,\chi(\alpha),\omega(\alpha))$ over $\alpha \in (0,1)$ characterizes the sum in Lemma 5 up to a constant factor.

Lemma 3. For every $u_0, \gamma, \eta \in [0,1)$,

$$\mathbf{E}[X(\gamma,\eta)]^2 = \bigl(2g_r(1/2,\gamma,\eta)\bigr)^n.$$

Lemma 4. For every $u_0$, there exists $\theta = \theta(k,A) > 0$ such that as $n \to \infty$,

$$\frac{\mathbf{E}[X_*]}{\mathbf{E}[X(\gamma_0,\eta_0)]} \to \theta.$$

Lemma 5. Let $\gamma(z), \eta(z)$ be arbitrary sequences such that $\gamma(z) \ge \gamma_0$ and $\eta(z) \ge \eta_0$ for every $0 \le z \le n$.
Then, for every $u_0$,

$$\mathbf{E}[X_*^2] \le 2^n \sum_{z=0}^{n} \binom{n}{z} f\bigl(z/n,\gamma(z),\eta(z)\bigr)^{rn}.$$

Lemma 6. Let $\chi : [0,1] \to [\gamma_0,1)$ and $\omega : [0,1] \to [\eta_0,1)$ be arbitrary piecewise-smooth functions and let
$g_r(\alpha) = g_r(\alpha,\chi(\alpha),\omega(\alpha))$. If there exists $\alpha_{\max} \in (0,1)$ such that $g_r(\alpha_{\max}) \equiv g_{\max} > g_r(\alpha)$ for all
$\alpha \ne \alpha_{\max}$, and $g_r''(\alpha_{\max}) < 0$, then there exists a constant $D = D_{\chi,\omega}(k,r,u_0) > 0$ such that for all
sufficiently large n

$$\mathbf{E}[X_*^2] \le D \times (2g_{\max})^n.$$

Combining Lemmata 3–6 we see that if for a given $u_0$ and r there exist $\chi, \omega$ such that for all $\alpha \ne 1/2$

$$g_r(1/2,\gamma_0,\eta_0) > g_r(\alpha,\chi(\alpha),\omega(\alpha)) \tag{15}$$

then $\mathbf{E}[X_*^2] < C \times \mathbf{E}[X_*]^2$ for some constant C, yielding the desired conclusion $\mathbf{E}[X_*^2] = O(\mathbf{E}[X_*]^2)$.

Indeed, to prove Theorem 1 we will show that for every $p \in (0,1]$ and for the stated r there exist
functions $\chi, \omega$ for which (15) holds. To simplify the asymptotic analysis, we use the crudest possible such
functions, paying the price of this simplicity in the value of $k_0$ in Proposition 7 below. We note that by
choosing a more refined (and more cumbersome) adaptation of $\gamma, \eta$ to $\alpha$ this value can be improved greatly.
Moreover, we emphasize that for any fixed value of k, one can get a sharper lower bound (such as those
reported in the Introduction) by partitioning [0, 1] into a large number of intervals and numerically finding a
good value of $\gamma, \eta$ for each one. We discuss this point further in Section 6. Finally, we note that general large
deviations considerations imply that for every k and p, the condition (15) is sharp for our method. That is,
no better lower bound can be derived by considering balanced assignments and, in fact, by any argument
that classifies assignments according to their number of satisfied literal occurrences in the formula.

Definition 1. Let $q = 1 - p_0 = u_0 2^k$ and let



(l - 2Qk2-''f'^A where ip{q) = -■ . (16) 

V / 1 — q + qlnq 



1 - q + qlnq \ I 1-g + glnq 

Theorem 1 will follow from the following Proposition.
Proposition 7. Let



g^(a,7o,77o) «/" e [^,1 



3 In fc 1 



GM) = { (17) 
Qr (a, Vto , \A7o ) otherwise. 

For all $k \ge k_0$, if $r \le t_k$ then $G_r''(1/2) < 0$ and $G_r(1/2) > G_r(\alpha)$ for all $\alpha \ne 1/2$.



The proof of Proposition 7 itself will be decomposed into three lemmata of increasing difficulty. The
first lemma holds for any $\gamma, \eta$ and reduces the proof to the case $\alpha \ge 1/2$. The second lemma reflects the
behavior of f (and thus $g_r$) around $\alpha = 1/2$, motivating the judicious choice $\eta = \eta_0$ and $\gamma = \gamma_0$ for $G_r$.
The third lemma deals with $\alpha$ near 1. That case needs a lot more work in order to handle the unique local
maximum of $g_r$ in that region. The condition $r \le t_k$ and the change to $\gamma = \sqrt{\gamma_0}$, $\eta = \sqrt{\eta_0}$ aim precisely at
keeping the value of $g_r$ at this other local maximum smaller than $g_r(1/2,\gamma_0,\eta_0)$.

Lemma 8. For every $0 < x \le 1/2$, $G_r(1/2 + x) > G_r(1/2 - x)$.

Lemma 9. For all k > kg, if r < i_^^*^^gf„g then G"(l/2) < and Gr is strictly decreasing on [i, 1 — '^^f^] ■ 
Lemma 10. For all k>ka, if r < tk then for every a € [l - ^^,l], Gr(l/2) > Gr{a). 

In the following sections we prove Lemmata 3–6 while Lemmata 8–10 are proven in the appendix. Before
delving into the probabilistic calculations involved in proving Lemmata 3–6 a couple of remarks are in order.

Relationship to other k-CNF models: Recall that the m clauses of $F_k(n,m)$ are chosen independently
with replacement among the $(2n)^k$ possibilities. Thus, the m clauses $\{c_i\}_{i=1}^m$ are i.i.d. random variables, each
$c_i$ being the disjunction of k i.i.d. random variables $\{\ell_{ij}\}_{j=1}^k$, each $\ell_{ij}$ being a uniformly random literal.
This viewpoint of the formula as a sequence of km i.i.d. random literals will be very handy for our calculations.

Clearly, in this model some clauses might be improper, i.e. they might contain repeated and/or contradictory
literals. At the same time, though, observe that the probability that any given clause is improper is
smaller than $k^2/n$ and, moreover, the proper clauses are uniformly selected among all such clauses. Therefore
w.h.p. the number of improper clauses is o(n), implying that if for a given r, $F_k(n,rn)$ is p-satisfiable w.h.p.
then for m = rn − o(n), the same is true in the model where we only select among proper clauses. The
issue of selecting clauses without replacement is completely analogous as w.h.p. there are o(n) clauses that
contain the same k variables as some other clause.

Notation: In the ensuing probabilistic calculations it will be convenient to write $\sigma \not\models F$ to denote that the
truth assignment $\sigma$ violates the formula F, where F can be a literal, a clause, or an entire CNF.

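3 The first moment and proof of Lemma 3

By linearity of expectation and since the m = rn clauses $c_1, c_2, \ldots, c_m$ are chosen independently we have

$$\mathbf{E}[X] = \eta^{-u_0 m}\sum_{\sigma}\mathbf{E}\Bigl[\prod_{i=1}^{m}\gamma^{H(\sigma,c_i)}\eta^{U(\sigma,c_i)}\Bigr]
 = \eta^{-u_0 m}\sum_{\sigma}\prod_{i=1}^{m}\mathbf{E}\bigl[\gamma^{H(\sigma,c_i)}\eta^{U(\sigma,c_i)}\bigr]. \tag{18}$$

Observe now that since the clauses are identically distributed, by symmetry, it suffices to consider the
expectation in (18) for a single random clause $c = \ell_1 \vee \cdots \vee \ell_k$ and a fixed truth assignment $\sigma$. Moreover,
observe that if we write $\gamma^H\eta^U$ as $\gamma^H + \gamma^H(\eta^U - 1)$ we see that the second expression is non-zero only when
U > 0, i.e. when c is violated by $\sigma$. So, since the literals $\ell_1, \ldots, \ell_k$ are i.i.d. we get

$$\mathbf{E}\bigl[\gamma^{H(\sigma,c)}\eta^{U(\sigma,c)}\bigr]
 = \mathbf{E}\Bigl[\prod_{i=1}^{k}\gamma^{H(\sigma,\ell_i)}\Bigr] - 2^{-k}\gamma^{-k}(1-\eta)
 = \left(\frac{\gamma+\gamma^{-1}}{2}\right)^{k} - 2^{-k}\gamma^{-k}(1-\eta) \equiv Z(\gamma,\eta). \tag{19}$$

Thus,

$$\mathbf{E}[X] = 2^n\bigl(\eta^{-u_0}Z(\gamma,\eta)\bigr)^{rn}. \tag{20}$$

Observe now that

$$\bigl(\eta^{-u_0}Z(\gamma,\eta)\bigr)^2 = f(1/2,\gamma,\eta).$$

Therefore,

$$\mathbf{E}[X]^2 = \Bigl[2^n\bigl(\eta^{-u_0}Z(\gamma,\eta)\bigr)^{rn}\Bigr]^2 = \bigl[4f(1/2,\gamma,\eta)^r\bigr]^n = \bigl[2g_r(1/2,\gamma,\eta)\bigr]^n.$$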
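The single-clause expectation can be confirmed by exhaustive enumeration, since for a fixed assignment each of the k i.i.d. literals is satisfied independently with probability 1/2. The sketch below compares the closed form $Z(\gamma,\eta) = ((\gamma+\gamma^{-1})/2)^k - (1-\eta)(2\gamma)^{-k}$ with a direct sum over the $2^k$ satisfied/unsatisfied patterns; the names `Z_closed` and `Z_enum` are hypothetical.

```python
import itertools
import math

def Z_closed(k, gamma, eta):
    """Closed form for E[gamma^H eta^U] over one random clause."""
    return ((gamma + 1 / gamma) / 2) ** k - (1 - eta) * (2 * gamma) ** -k

def Z_enum(k, gamma, eta):
    """Same expectation by enumerating which literals are satisfied (+1) or not (-1)."""
    total = 0.0
    for pattern in itertools.product([1, -1], repeat=k):
        H = sum(pattern)
        U = 1 if all(s == -1 for s in pattern) else 0
        total += gamma ** H * eta ** U
    return total / 2 ** k

checks = [(2, 0.7, 0.4), (3, 0.9, 0.0), (5, 0.5, 0.8)]
agree = all(math.isclose(Z_closed(k, g, e), Z_enum(k, g, e), rel_tol=1e-12)
            for k, g, e in checks)
print(agree)
```

Setting $\gamma = 1$ recovers the factor $1 - (1-\eta)2^{-k}$ used in the proof of Lemma 2.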
4 Proof of Lemma 4

By linearity of expectation, it suffices to prove that there exists some $\theta = \theta(k,A) > 0$ such that for the
values of $\gamma_0, \eta_0$ satisfying (12) and every truth assignment $\sigma$, we have

$$\frac{\mathbf{E}\bigl[\gamma_0^{H(\sigma,F)}\eta_0^{U(\sigma,F)}\mathbf{1}_{\sigma\in S^*}\bigr]}
       {\mathbf{E}\bigl[\gamma_0^{H(\sigma,F)}\eta_0^{U(\sigma,F)}\bigr]} \to \theta. \tag{21}$$

Recalling that formulas in our model are sequences of i.i.d. random literals $\ell_1, \ldots, \ell_{km}$, let $P(\cdot)$ denote the
probability assigned by our distribution to any such sequence, i.e. $(2n)^{-km}$. Now, fix any truth assignment
$\sigma$ and consider an auxiliary distribution $P_\sigma$ on k-CNF formulas where the m clauses $c_1, \ldots, c_m$ are again
i.i.d. among all $(2n)^k$ clauses, but where now for any fixed clause $\omega$

$$P_\sigma(c_i = \omega) = \frac{\gamma_0^{H(\sigma,\omega)}\eta_0^{U(\sigma,\omega)}P(c_i = \omega)}{Z(\gamma_0,\eta_0)}, \tag{22}$$

where

$$Z(\gamma_0,\eta_0) = \mathbf{E}\bigl[\gamma_0^{H(\sigma,c)}\eta_0^{U(\sigma,c)}\bigr] \tag{23}$$

was defined in (19). (Since each fixed clause $\omega$ receives probability proportional to
$\gamma_0^{H(\sigma,\omega)}\eta_0^{U(\sigma,\omega)}$, indeed $Z(\gamma_0,\eta_0)$ provides the correct normalization to a probability distribution.)
So, whereas under $P(\cdot)$ every k-CNF formula F with m clauses had the same probability $P(F) = (2n)^{-km}$,
under $P_\sigma$ its probability is

$$P_\sigma(F) = \frac{\gamma_0^{H(\sigma,F)}\eta_0^{U(\sigma,F)}P(F)}{Z(\gamma_0,\eta_0)^m}. \tag{24}$$

Let $\mathbf{E}_\sigma$ be the expectation operator corresponding to $P_\sigma$. A calculation similar to that leading to (19),
adding the equal contributions from the k literals, gives that for a single random clause c

$$Z(\gamma_0,\eta_0)\,\mathbf{E}_\sigma[H(\sigma,c)] = k(\gamma_0-\gamma_0^{-1})\,\frac{(\gamma_0+\gamma_0^{-1})^{k-1}}{2^k} + k(2\gamma_0)^{-k}(1-\eta_0). \tag{25}$$

Moreover,

$$Z(\gamma_0,\eta_0)\,\mathbf{E}_\sigma[U(\sigma,c)] = (2\gamma_0)^{-k}\eta_0. \tag{26}$$

Thus (12) ensures that $\mathbf{E}_\sigma[H(\sigma,c)] = 0$ and also that $\mathbf{E}_\sigma[U(\sigma,c) - u_0] = 0$.

Next, we apply the multivariate central limit theorem (see, e.g. [Pol02], page 182) to the i.i.d. mean-zero
random vectors $(H(\sigma,c_i),\, U(\sigma,c_i) - u_0)$ for $i = 1, \ldots, m$. Observe that, since $k \ge 2$, the common law of
these random vectors is not supported on a line. We deduce that as $n \to \infty$

$$P_\sigma\bigl[H(\sigma,F) \ge 0 \text{ and } U(\sigma,F) \in [mu_0,\; mu_0 + A\sqrt{m}]\bigr] \to \theta(k,A) > 0.$$

Here, the right hand side is the probability that a certain nondegenerate bivariate normal law assigns to a
certain open set. Its exact value is unimportant for our purpose. By (24), this is equivalent to (21).



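5 Proof of Lemma 5

Linearity of expectation implies

$$\eta_0^{2u_0 m}\,\mathbf{E}[X_*^2] = \sum_{\sigma,\tau}\mathbf{E}\Bigl[\gamma_0^{H(\sigma,F)+H(\tau,F)}\,\eta_0^{U(\sigma,F)+U(\tau,F)}\,\mathbf{1}_{\sigma,\tau\in S^*(F)}\Bigr]. \tag{27}$$

Observe now that since $\sigma \in S^*$ implies $H(\sigma,F) \ge 0$ and $U(\sigma,F) \ge u_0 m$, we get that for every pair $\sigma, \tau$ and
any $\gamma \ge \gamma_0$ and $\eta \ge \eta_0$,

$$\eta_0^{-2u_0 m}\,\mathbf{E}\Bigl[\gamma_0^{H(\sigma,F)+H(\tau,F)}\eta_0^{U(\sigma,F)+U(\tau,F)}\mathbf{1}_{\sigma,\tau\in S^*(F)}\Bigr]
 \le \eta^{-2u_0 m}\,\mathbf{E}\Bigl[\gamma^{H(\sigma,F)+H(\tau,F)}\eta^{U(\sigma,F)+U(\tau,F)}\mathbf{1}_{\sigma,\tau\in S^*(F)}\Bigr]
 \le \eta^{-2u_0 m}\,\mathbf{E}\Bigl[\gamma^{H(\sigma,F)+H(\tau,F)}\eta^{U(\sigma,F)+U(\tau,F)}\Bigr]. \tag{28}$$

In other words, when using the right hand side of (28) to bound each term of the sum in (27), we are allowed
to adapt the value of $\gamma$ and $\eta$ to the pair $\sigma, \tau$, the only restrictions being $\gamma \ge \gamma_0$ and $\eta \ge \eta_0$. This is a crucial
point and we will exploit it heavily when bounding the contribution of pairs with large overlap.

To estimate the right hand side of (28) for any pair $\sigma, \tau$ we first observe that since the m clauses
$c_1, c_2, \ldots, c_m$ are i.i.d., letting c be a single random clause we have

$$\mathbf{E}\Bigl[\gamma^{H(\sigma,F)+H(\tau,F)}\eta^{U(\sigma,F)+U(\tau,F)}\Bigr]
 = \mathbf{E}\Bigl[\prod_{i=1}^{m}\gamma^{H(\sigma,c_i)+H(\tau,c_i)}\eta^{U(\sigma,c_i)+U(\tau,c_i)}\Bigr]
 = \Bigl(\mathbf{E}\Bigl[\gamma^{H(\sigma,c)+H(\tau,c)}\eta^{U(\sigma,c)+U(\tau,c)}\Bigr]\Bigr)^{m}. \tag{29}$$

Next, we observe that for every pair $\sigma, \tau$, by symmetry, the expectation in (29) depends only on the
number of variables to which $\sigma, \tau$ assign the same value. So, let $\sigma, \tau$ be any pair of truth assignments that
agree on exactly $z = \alpha n$ variables, i.e. have overlap z. By first rewriting (again) $\gamma^H\eta^U$ as $\gamma^H + \gamma^H(\eta^U - 1)$
and then observing that $\gamma^{H(\sigma,c)+H(\tau,c)}\mathbf{1}_{\sigma\not\models c}$ is distributed identically with
$\gamma^{H(\sigma,c)+H(\tau,c)}\mathbf{1}_{\tau\not\models c}$ we get

$$\mathbf{E}\Bigl[\gamma^{H(\sigma,c)+H(\tau,c)}\eta^{U(\sigma,c)+U(\tau,c)}\Bigr]
 = \mathbf{E}\Bigl[\gamma^{H(\sigma,c)+H(\tau,c)}\Bigr]
 - 2(1-\eta)\,\mathbf{E}\Bigl[\gamma^{H(\sigma,c)+H(\tau,c)}\mathbf{1}_{\sigma\not\models c}\Bigr]
 + 2^{-k}\alpha^k\gamma^{-2k}(1-\eta)^2. \tag{30}$$

Now, to estimate (30) we note that since the literals $\ell_1, \ell_2, \ldots, \ell_k$ comprising c are i.i.d. we have

$$\mathbf{E}\Bigl[\gamma^{H(\sigma,c)+H(\tau,c)}\Bigr] = \prod_{i=1}^{k}\mathbf{E}\Bigl[\gamma^{H(\sigma,\ell_i)+H(\tau,\ell_i)}\Bigr]
 = \left(\alpha\,\frac{\gamma^2+\gamma^{-2}}{2} + 1 - \alpha\right)^{k}$$

and, similarly,

$$\mathbf{E}\Bigl[\gamma^{H(\sigma,c)+H(\tau,c)}\mathbf{1}_{\sigma\not\models c}\Bigr]
 = \prod_{i=1}^{k}\mathbf{E}\Bigl[\gamma^{H(\sigma,\ell_i)+H(\tau,\ell_i)}\mathbf{1}_{\sigma\not\models\ell_i}\Bigr]
 = 2^{-k}\left(\alpha\gamma^{-2} + (1-\alpha)\right)^{k}.$$

Substituting these last two equations in (30) we get

$$\eta^{-2u_0}\,\mathbf{E}\Bigl[\gamma^{H(\sigma,c)+H(\tau,c)}\eta^{U(\sigma,c)+U(\tau,c)}\Bigr] = f(\alpha,\gamma,\eta)
 = \eta^{-2u_0}\left[\left(\alpha\,\frac{\gamma^2+\gamma^{-2}}{2} + 1 - \alpha\right)^{k}
 - 2(1-\eta)\,2^{-k}\left(\alpha\gamma^{-2} + (1-\alpha)\right)^{k}
 + (1-\eta)^2\,2^{-k}\alpha^k\gamma^{-2k}\right]. \tag{31}$$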
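The two-assignment computation can also be verified by enumeration. For a pair with overlap fraction $\alpha$, each random literal is in one of four states: on an agreeing variable and satisfied by both (probability $\alpha/2$), on an agreeing variable and violated by both ($\alpha/2$), or on a disagreeing variable and violated by exactly one of the two ($(1-\alpha)/2$ each). The sketch below sums over all $4^k$ combinations and compares with the closed form; the function names are hypothetical.

```python
import itertools
import math

def f_closed(alpha, gamma, eta, k, u0):
    """Closed form of f(alpha, gamma, eta)."""
    a = (alpha * (gamma ** 2 + gamma ** -2) / 2 + 1 - alpha) ** k
    b = 2 * (1 - eta) * 2.0 ** -k * (alpha * gamma ** -2 + 1 - alpha) ** k
    c = (1 - eta) ** 2 * 2.0 ** -k * alpha ** k * gamma ** (-2 * k)
    return eta ** (-2 * u0) * (a - b + c)

def f_enum(alpha, gamma, eta, k, u0):
    """Same quantity by enumerating the joint state of each of the k literals."""
    # (probability, H(sigma) + H(tau), violated by sigma, violated by tau)
    states = [(alpha / 2, 2, False, False), (alpha / 2, -2, True, True),
              ((1 - alpha) / 2, 0, True, False), ((1 - alpha) / 2, 0, False, True)]
    total = 0.0
    for combo in itertools.product(states, repeat=k):
        prob = math.prod(s[0] for s in combo)
        H = sum(s[1] for s in combo)
        U = int(all(s[2] for s in combo)) + int(all(s[3] for s in combo))
        total += prob * gamma ** H * eta ** U
    return eta ** (-2 * u0) * total

k, u0, gamma, eta = 3, 0.0625, 0.8, 0.4
agree = all(math.isclose(f_closed(a, gamma, eta, k, u0),
                         f_enum(a, gamma, eta, k, u0), rel_tol=1e-12)
            for a in (0.2, 0.5, 0.9, 1.0))
Z = ((gamma + 1 / gamma) / 2) ** k - (1 - eta) * (2 * gamma) ** -k
print(agree, f_closed(0.5, gamma, eta, k, u0), (eta ** -u0 * Z) ** 2)
```

At $\alpha = 1/2$ the two assignments decouple, and f reduces to the square of the single-assignment quantity $\eta^{-u_0}Z(\gamma,\eta)$ used in the first moment.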
So, in conclusion, since the number of ordered pairs with overlap z is $2^n\binom{n}{z}$ we get that

$$\mathbf{E}[X_*^2] \le 2^n\sum_{z=0}^{n}\binom{n}{z} f\bigl(z/n,\gamma(z),\eta(z)\bigr)^{rn}, \tag{32}$$

for any set of choices for $\gamma(z), \eta(z)$ such that $\gamma(z) \ge \gamma_0$ and $\eta(z) \ge \eta_0$ for all $0 \le z \le n$.
5.1 Proof of Lemma 6

If $\chi : [0,1] \to [\gamma_0,1)$ and $\omega : [0,1] \to [\eta_0,1)$ are piecewise smooth, then from the definition of f we see that
$f(\alpha,\chi(\alpha),\omega(\alpha))$ is also piecewise smooth. Thus, we can decompose the sum in (32) into a fixed number of
sums such that $f(\alpha,\chi(\alpha),\omega(\alpha))$ is smooth in the range of each sum. To bound each such sum, then, we use
the following lemma whose proof is implied by the proof of Lemma 2 in [AM02] (that lemma is stated with
the requirement that $\phi$ is analytic, a condition not needed for the proof; in fact, it suffices for $\phi$ to only be
twice differentiable). The idea is that each of these sums is dominated by the contribution of $\Theta(n^{1/2})$ terms
around the maximum term. Since the number of sums is finite the lemma follows.

Lemma 11. Let $\phi$ be any real, positive, twice-differentiable function on [0, 1] and let

$$S_n = \sum_{z=0}^{n}\binom{n}{z}\phi(z/n)^n.$$

Letting $0^0 \equiv 1$, define g on [0, 1] as

$$g(\alpha) = \frac{\phi(\alpha)}{\alpha^{\alpha}(1-\alpha)^{1-\alpha}}.$$

If there exists $\alpha_{\max} \in (0,1)$ such that $g(\alpha_{\max}) \equiv g_{\max} > g(\alpha)$ for all $\alpha \ne \alpha_{\max}$, and $g''(\alpha_{\max}) < 0$, then
there exist constants B, C > 0 such that for all sufficiently large n

$$B \times g_{\max}^n \le S_n \le C \times g_{\max}^n.$$
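A small experiment illustrates the statement of Lemma 11. The sketch below uses a made-up test function $\phi(\alpha) = 1 - (\alpha - 1/2)^2$, for which g is maximized at $\alpha = 1/2$ with $g_{\max} = 2$, and evaluates $S_n/g_{\max}^n$ in log space for growing n. The function names are hypothetical and the choice of $\phi$ is not from the paper.

```python
import math

def log_binom(n, z):
    """Natural logarithm of the binomial coefficient C(n, z)."""
    return math.lgamma(n + 1) - math.lgamma(z + 1) - math.lgamma(n - z + 1)

def normalized_sum(phi, g_max, n):
    """S_n / g_max^n, with every term accumulated in log space to avoid overflow."""
    return sum(math.exp(log_binom(n, z)
                        + n * (math.log(phi(z / n)) - math.log(g_max)))
               for z in range(n + 1))

phi = lambda a: 1 - (a - 0.5) ** 2      # hypothetical test function
g_max = 2 * phi(0.5)                    # g(1/2) = phi(1/2) / (1/2) = 2
ratios = [normalized_sum(phi, g_max, n) for n in (100, 400, 1600, 6400)]
print(ratios)
```

The ratios settle near a constant strictly between 0 and 1 instead of decaying or growing with n, which is the content of the two-sided bound $B \times g_{\max}^n \le S_n \le C \times g_{\max}^n$.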



6 Bounds for finite k 

As mentioned in Section 2, for small values of $k$ the simple adaptation scheme of Proposition 3 does not
yield the best possible lower bound for p-satisfiability afforded by our method. For that, one has to use a
significantly more refined adaptation of $\gamma, \eta$ with respect to $\alpha$. Our lower bounds reported in Figure 1 are,
indeed, the result of performing such an optimization of $\gamma, \eta$ numerically (for both the upper bound plots and
the plots of the lower bound from [CGHS03] we used the explicit formulas).

Specifically, to create the plots of the lower bounds we computed a lower bound for 100 equally spaced
values of $p$ on the horizontal axis (and then had Maple's [Red94] plotting function "connect the dots").
For each of these values of $p$, to prove the corresponding lower bound for $r$ we had to establish that there
exists a choice of functions as in Lemma IHl such that for all $\alpha \in (1/2, 1]$ we have $g_r(1/2, \gamma_0, \eta_0) >
g_r(\alpha, \gamma(\alpha), \eta(\alpha))$. To that end, we partitioned $(1/2, 1]$ using 10,000 points and for each such point we searched
for values of $\gamma > \gamma_0$ and $\eta > \eta_0$ such that this condition holds with a bit of room. (For $k > 4$ we solved (12),
defining $\gamma_0$ and $\eta_0$, numerically to 10 digits of accuracy. For the optimization we exploited convexity to speed
up the search.) Having determined such values, we (implicitly) extended the functions to all of $(1/2, 1]$ by
assigning to every not-chosen point the value at the nearest chosen point. Finally, we computed a (crude)
upper bound on the derivative of $g_r$ with respect to $\alpha$ in $(1/2, 1]$. This bound on the derivative, along with
our room factor, then implied that for every point that we did not check, the value of $g_r$ was sufficiently
close to its value at the corresponding chosen point to also be dominated by $g_r(1/2, \gamma_0, \eta_0)$.
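
The last step of this procedure, passing from finitely many checked points to the whole interval by means of a derivative bound and some room, can be sketched generically in Python. This is only an illustration of that one step under our own assumptions: `certify_below` and its parameters are hypothetical names, the function being checked is a stand-in, and the per-point search for $\gamma$ and $\eta$ described above is not reproduced.

```python
import math

def certify_below(func, lo, hi, target, deriv_bound, num_points=10_000):
    """Try to certify func(a) < target for every a in [lo, hi].

    func is evaluated on an evenly spaced grid. Every unchecked point lies within
    half a grid step of a checked one, so if |func'| <= deriv_bound the value there
    exceeds the grid value by at most deriv_bound * step / 2. The check succeeds
    only if each grid value leaves at least that much room below target.
    """
    step = (hi - lo) / (num_points - 1)
    slack = deriv_bound * step / 2.0
    for i in range(num_points):
        a = lo + i * step
        if func(a) + slack >= target:
            return False  # not enough room at this grid point
    return True

if __name__ == "__main__":
    # Stand-in example: sin(3a) on [0, 1] has derivative at most 3 in absolute value.
    print(certify_below(lambda a: math.sin(3.0 * a), 0.0, 1.0, 1.01, 3.0))
```

A failed check does not show that the inequality is false; it only means that the grid is too coarse, or the room too small, for this argument to go through.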






Acknowledgements 



We thank Cris Moore for helpful conversations in the early stages of this work.



References 

[AM02] Dimitris Achlioptas and Cristopher Moore, The asymptotic order of the random k-SAT threshold,
43rd Annual Symposium on Foundations of Computer Science (Vancouver, BC, 2002), IEEE
Comput. Soc. Press, Los Alamitos, CA, 2002, pp. 779-788.

[AP03] Dimitris Achlioptas and Yuval Peres, The random k-SAT threshold is $2^k \ln 2 - O(k)$, 35th Annual
ACM Symposium on Theory of Computing (San Diego, CA), 2003, to appear.

[AS91] Noga Alon and Joel H. Spencer, The Probabilistic Method, Wiley, 1991.

[BFU93] Andrei Z. Broder, Alan M. Frieze, and Eli Upfal, On the satisfiability and maximum satisfiability
of random 3-CNF formulas, Proc. 4th Annual ACM-SIAM Symposium on Discrete Algorithms,
1993, pp. 322-330.

[BKPS02] Paul Beame, Richard Karp, Toniann Pitassi, and Michael Saks, The efficiency of resolution and
Davis-Putnam procedures, SIAM J. Comput. 31 (2002), no. 4, 1048-1075.

[BP96] Paul W. Beame and Toniann Pitassi, Simplified and improved resolution lower bounds, Proceedings
37th Annual Symposium on Foundations of Computer Science (Burlington, VT), IEEE,
October 1996, pp. 274-282.

[CGHS03] Don Coppersmith, David Gamarnik, Mohammad T. Hajiaghayi, and Gregory B. Sorkin, Random
MAX 2-SAT and MAX CUT, 14th Annual ACM-SIAM Symposium on Discrete Algorithms
(Baltimore, MD, 2003), ACM, New York, 2003.

[Coo71] Stephen A. Cook, The complexity of theorem-proving procedures, 3rd Annual ACM Symposium
on Theory of Computing (Shaker Heights, OH, 1971), ACM, New York, 1971, pp. 151-158.

[CS88] Vasek Chvatal and Endre Szemeredi, Many hard examples for resolution, J. Assoc. Comput.
Mach. 35 (1988), no. 4, 759-768.

[DM95] Paul Deheuvels and David M. Mason, On the Fractal Nature of Empirical Increments, The Annals
of Probability 23 (1995), 355-387.

[DB97] Olivier Dubois and Yacine Boufkhad, A general upper bound for the satisfiability threshold of
random r-SAT formulae, J. Algorithms 24 (1997), no. 2, 395-420.

[FdlVK02] W. Fernandez de la Vega and Marek Karpinski, 9/8-approximation algorithm for random
MAX-3SAT, Technical Report TR02-070, Electronic Colloquium on Computational Complexity (2002).

[Fei02] Uriel Feige, Relations between average case complexity and approximation complexity, 34th Annual
ACM Symposium on Theory of Computing (Montreal, QC), 2002, pp. 534-543.

[GJ79] Michael R. Garey and David S. Johnson, Computers and intractability, Freeman, San Francisco,
CA, 1979.

[GSCK00] C. P. Gomes, B. Selman, N. Crato, and H. Kautz, Heavy-tailed phenomena in satisfiability
and constraint satisfaction problems, J. Automat. Reason. 24 (2000), no. 1-2, 67-100. MR
2000k:68070

[Pol02] David Pollard, A user's guide to measure theoretic probability, Cambridge Series in Statistical
and Probabilistic Mathematics, Cambridge University Press, Cambridge, 2002. MR 2002k:60003






[Red94] Darren Redfern, The Maple Handbook: Maple V Release 3, third ed., Springer Verlag, New York,
1994.

[SK93] Bart Selman and Henry Kautz, Domain-independent extensions to GSAT: Solving large structured
satisfiability problems, Proc. 13th International Joint Conference on Artificial Intelligence, 1993,
pp. 290-295.

[SLM92] Bart Selman, Hector Levesque, and D. Mitchell, A new method for solving hard satisfiability
problems, Proc. 10th National Conference on Artificial Intelligence, 1992, pp. 440-446.






A Building up an arsenal 

In this section we collect some basic inequalities and identities that we will use in the proofs of Lemmas |S1 
151 and [TUI For readability, in this Appendix we have replaced q of Definition ^ with the letter y. 
Plugging in the definition of $\eta_0$ from (12) into the definition of $f$ we get

/(a,7o,??o) 

= % <^ (^1 - a + a j (a7o + 1 - a) + ^^j-^ ^ . 

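For some parts of the ensuing calculations, it will be convenient to use the following normalizations of
$f(\alpha, \gamma_0, \eta_0)$ and $g_r(\alpha, \gamma_0, \eta_0)$, denoted by $f_0$ and $g_0$ respectively:

$$f_0(\alpha) = 2^{2k}\gamma_0^{2k}\eta_0^{y/2^{k-1}} f(\alpha, \gamma_0, \eta_0) \quad\text{and}\quad g_0(\alpha) = 2^{2kr}\gamma_0^{2kr}\eta_0^{ry/2^{k-1}} g_r(\alpha, \gamma_0, \eta_0) . \qquad (33)$$

We will also write $\varepsilon_0 = 1 - \gamma_0^2$. With this notation, we have the following formula for $f_0$, which holds for
every $x \in [-1/2, 1/2]$:

$$f_0\left(\tfrac12 + x\right) = \left[2x\varepsilon_0^2 + (2-\varepsilon_0)^2\right]^k - 2\varepsilon_0(2-\varepsilon_0)^{k-1}\left[2 - \varepsilon_0 + 2x\varepsilon_0\right]^k + \varepsilon_0^2(2-\varepsilon_0)^{2k-2}(1+2x)^k . \qquad (34)$$

In particular,

$$f_0\left(\tfrac12\right) = (2-\varepsilon_0)^{2k} - 2\varepsilon_0(2-\varepsilon_0)^{2k-1} + \varepsilon_0^2(2-\varepsilon_0)^{2k-2} = 4(1-\varepsilon_0)^2(2-\varepsilon_0)^{2k-2} . \qquad (35)$$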

The function $y \mapsto 1 - y + y\ln y$, defined on $[0, 1]$, appears throughout our analysis. The following
inequalities, valid for all $y \in [0, 1]$, will be used:

$$\frac{(1-y)^2}{2} \le 1 - y + y\ln y \le (1-y)^2 . \qquad (36)$$

The right-hand inequality follows from the estimate $\ln y \le y - 1$. The left-hand inequality follows from
integrating this estimate as follows:

$$-1 + y - y\ln y = \int_y^1 \ln x \, dx \le -\int_y^1 (1-x)\, dx = -\frac{(1-y)^2}{2} .$$
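
As a quick sanity check of the two-sided estimate for $1 - y + y\ln y$ (it does not replace the proof), the inequalities can be tested on a grid. The Python sketch below uses the convention $0 \ln 0 = 0$; the names `middle` and `check_36` are ours.

```python
import math

def middle(y):
    """1 - y + y*ln(y), with the convention 0*ln(0) = 0."""
    return 1.0 - y + (y * math.log(y) if y > 0.0 else 0.0)

def check_36(num_points=1001, tol=1e-12):
    """True if (1-y)^2/2 <= 1 - y + y ln y <= (1-y)^2 at every grid point of [0, 1]."""
    for i in range(num_points):
        y = i / (num_points - 1)
        lower = (1.0 - y) ** 2 / 2.0
        upper = (1.0 - y) ** 2
        # tol absorbs floating-point rounding near y = 1, where all terms vanish
        if not (lower - tol <= middle(y) <= upper + tol):
            return False
    return True

if __name__ == "__main__":
    print(check_36())
```

The upper bound is attained at $y = 0$, and the lower bound becomes asymptotically tight as $y \to 1$.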

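We end this section by providing some estimates for the values of $\varepsilon_0$ and $\eta_0$.

Fact 1. For all sufficiently large $k$,

$$\frac{2(1-y)}{2^k - k - 1} - \frac{4k(1-y)^2}{2^{2k}} \le \varepsilon_0 \le \frac{2(1-y)}{2^k - k - 1} , \qquad (37)$$

and

$$\eta_0 \le \min\left\{ y,\; y - \frac{(k+1)(1-y)}{2^k - k - 1} + \frac{4k(1-y)^2}{2^k} \right\} . \qquad (38)$$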
A.1 Proof of Fact 1

By the first equation in (12) we have that $\eta_0 = 1 - \varepsilon_0(2-\varepsilon_0)^{k-1}$. Plugging this into the second equation in
(12) we find that

$$\frac{y}{2^k} = \frac{1 - \varepsilon_0(2-\varepsilon_0)^{k-1}}{(2-\varepsilon_0)^k - \varepsilon_0(2-\varepsilon_0)^{k-1}} = \frac{1 - \varepsilon_0(2-\varepsilon_0)^{k-1}}{2(1-\varepsilon_0)(2-\varepsilon_0)^{k-1}} . \qquad (39)$$

Hence, if we denote

$$\psi(t) = \sum_{j=1}^{k-1} (2-t)^{-j} = \frac{(2-t)^{k-1} - 1}{(1-t)(2-t)^{k-1}} , \qquad (40)$$

then we require that $\psi(\varepsilon_0) = 1 - y/2^{k-1}$. We collect below some useful properties of $\psi$.

Lemma 12. $\psi$ is increasing on $[0, 1]$. Furthermore, for $k$ large enough and every $0 \le t \le 1/(2k)$,

$$1 - \frac{1}{2^{k-1}} + t - \frac{(k+1)t}{2^k} + \frac{t^2}{2} \le \psi(t) \le 1 - \frac{1}{2^{k-1}} + t - \frac{(k+1)t}{2^k} + 2t^2 . \qquad (41)$$



Proof. The fact that $\psi$ is increasing follows immediately from the first formula in (40). To prove the
inequalities in (41), observe that

$$\psi(t) = \frac{1}{1-t} - \frac{1}{(1-t)(2-t)^{k-1}} = \frac{1}{1-t} - \frac{1}{2^{k-1}(1-t)\left(1-\frac{t}{2}\right)^{k-1}} .$$

To estimate $\psi$ from below, we use the inequalities $1 + a + a^2 \le 1/(1-a) \le 1 + a + 2a^2$ and $(1-a)^{k-1} \ge
1 - (k-1)a$, valid for all $0 \le a \le 1/2$, to show that whenever $t \le 1/(2k)$

$$\psi(t) \ge 1 + t + t^2 - \frac{1}{2^{k-1}}\left(1 + t + 2t^2\right)\left(1 + \frac{(k-1)t}{2} + 2\,\frac{(k-1)^2t^2}{4}\right) \ge 1 - \frac{1}{2^{k-1}} + t - \frac{(k+1)t}{2^k} + \frac{t^2}{2} ,$$

for all $k$ sufficiently large.

The reverse inequality is just as simple:

$$\psi(t) \le 1 + t + 2t^2 - \frac{1}{2^{k-1}}(1+t)\left(1 + \frac{(k-1)t}{2}\right) \le 1 - \frac{1}{2^{k-1}} + t - \frac{(k+1)t}{2^k} + 2t^2 . \qquad \Box$$
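
The two-sided estimate of Lemma 12 can be probed numerically for moderate $k$. The Python sketch below assumes $\psi(t) = \sum_{j=1}^{k-1}(2-t)^{-j}$, which is our reading of the definition of $\psi$ in Appendix A.1, together with its closed form; the function names are hypothetical, and the check is only a sanity test on a grid of $(0, 1/(2k)]$.

```python
def psi(t, k):
    """psi(t) = sum_{j=1}^{k-1} (2 - t)**(-j) (assumed form of the definition)."""
    return sum((2.0 - t) ** (-j) for j in range(1, k))

def psi_closed(t, k):
    """Closed form ((2-t)^(k-1) - 1) / ((1-t) (2-t)^(k-1)); requires t != 1."""
    p = (2.0 - t) ** (k - 1)
    return (p - 1.0) / ((1.0 - t) * p)

def lemma12_holds(k, num_points=200, tol=1e-12):
    """Check base + t^2/2 <= psi(t) <= base + 2 t^2 on a grid of (0, 1/(2k)]."""
    for i in range(1, num_points + 1):
        t = i / (num_points * 2.0 * k)
        base = 1.0 - 2.0 ** (1 - k) + t - (k + 1) * t / 2.0 ** k
        value = psi(t, k)
        if not (base + t * t / 2.0 - tol <= value <= base + 2.0 * t * t + tol):
            return False
    return True

if __name__ == "__main__":
    for k in (10, 15, 20):
        print(k, lemma12_holds(k))
```

For these values of $k$ the difference between $\psi(t)$ and the linear part is close to $t^2$, comfortably between the two quadratic corrections.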

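We are now in position to conclude the proof of Fact 1. Since $\psi$ is increasing and $\psi(\varepsilon_0) = 1 - y/2^{k-1}$, the
inequalities in (37) will be proved (note that $16 \le 4k$ once $k \ge 4$) once we show that

$$\psi\left(\frac{2(1-y)}{2^k-k-1} - \frac{16(1-y)^2}{2^{2k}}\right) \le 1 - \frac{y}{2^{k-1}} \le \psi\left(\frac{2(1-y)}{2^k-k-1}\right) . \qquad (42)$$

To prove the right-hand inequality in (42), set $t = \frac{2(1-y)}{2^k-k-1}$ and observe that for $k$ large enough, $t \le 1/(2k)$.
Hence, by Lemma 12,

$$\psi(t) \ge 1 - \frac{1}{2^{k-1}} + t - \frac{(k+1)t}{2^k} = 1 - \frac{1}{2^{k-1}} + \frac{2(1-y)}{2^k-k-1} - \frac{k+1}{2^k}\cdot\frac{2(1-y)}{2^k-k-1} = 1 - \frac{y}{2^{k-1}} .$$

The left-hand inequality in (42) is equally simple. In this case we apply Lemma 12 with
$t = \frac{2(1-y)}{2^k-k-1} - \frac{16(1-y)^2}{2^{2k}}$ and get that

$$\psi(t) \le 1 - \frac{1}{2^{k-1}} + t - \frac{(k+1)t}{2^k} + 2t^2 \le 1 - \frac{1}{2^{k-1}} + \frac{2(1-y)}{2^k-k-1} - \frac{k+1}{2^k}\cdot\frac{2(1-y)}{2^k-k-1} - \frac{2^k-k-1}{2^k}\cdot\frac{16(1-y)^2}{2^{2k}} + \frac{8(1-y)^2}{(2^k-k-1)^2} \le 1 - \frac{y}{2^{k-1}} ,$$

as long as $k$ is sufficiently large.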

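To prove the estimate (38), observe that the function $s \mapsto s(2-s)^{k-1}$ is increasing on $[0, 2/k]$. Since we
have shown that for sufficiently large $k$, $\varepsilon_0 \le \frac{2(1-y)}{2^k-k-1} \le \frac{2}{k}$, the lower bound on $\varepsilon_0$ established in (42) yields

$$\begin{aligned}
\eta_0 &= 1 - \varepsilon_0(2-\varepsilon_0)^{k-1} \\
&\le 1 - 2^{k-1}\left(\frac{2(1-y)}{2^k-k-1} - \frac{16(1-y)^2}{2^{2k}}\right)\left(1 - \frac{1-y}{2^k-k-1} + \frac{8(1-y)^2}{2^{2k}}\right)^{k-1} \\
&\le 1 - \left(\frac{2^k(1-y)}{2^k-k-1} - \frac{8(1-y)^2}{2^k}\right)\left(1 - \frac{(k-1)(1-y)}{2^k-k-1} + \frac{8(k-1)(1-y)^2}{2^{2k}}\right) \\
&\le y - \frac{(k+1)(1-y)}{2^k-k-1} + \frac{4k(1-y)^2}{2^k} ,
\end{aligned}$$

provided $k$ is large enough. The inequality $\eta_0 \le y$ is simpler. By (39),

$$\frac{y}{2^k} = \frac{1 - \varepsilon_0(2-\varepsilon_0)^{k-1}}{2(1-\varepsilon_0)(2-\varepsilon_0)^{k-1}} = \frac{\eta_0}{2^k(1-\varepsilon_0)(1-\varepsilon_0/2)^{k-1}} \ge \frac{\eta_0}{2^k} .$$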
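
The estimates of Fact 1 can be compared with a direct numerical solution. The Python sketch below assumes, as we read Appendix A.1, that $\varepsilon_0$ is the root of $\psi(\varepsilon_0) = 1 - y/2^{k-1}$ with $\psi(t) = \sum_{j=1}^{k-1}(2-t)^{-j}$, and that $\eta_0 = 1 - \varepsilon_0(2-\varepsilon_0)^{k-1}$. Bisection is our own choice of solver (the text only says the equations were solved numerically), and all names are hypothetical.

```python
def psi(t, k):
    """psi(t) = sum_{j=1}^{k-1} (2 - t)**(-j); increasing in t (assumed form)."""
    return sum((2.0 - t) ** (-j) for j in range(1, k))

def solve_eps0(y, k, iters=200):
    """Bisection for the root of psi(t) = 1 - y / 2**(k-1) on [0, 1/2]."""
    target = 1.0 - y / 2.0 ** (k - 1)
    lo, hi = 0.0, 0.5  # psi(0) <= target for y <= 1, and psi(1/2) > 1 for k >= 3
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if psi(mid, k) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def eta0_from_eps0(eps0, k):
    """eta_0 = 1 - eps_0 (2 - eps_0)^(k-1)."""
    return 1.0 - eps0 * (2.0 - eps0) ** (k - 1)

if __name__ == "__main__":
    k = 10
    for y in (0.25, 0.5, 0.75):
        eps0 = solve_eps0(y, k)
        upper = 2.0 * (1.0 - y) / (2 ** k - k - 1)
        lower = upper - 16.0 * (1.0 - y) ** 2 / 2.0 ** (2 * k)
        print(y, lower <= eps0 <= upper, eta0_from_eps0(eps0, k) <= y)
```

Already at $k = 10$ the computed $\varepsilon_0$ falls inside the window used in the proof, and $\eta_0$ lies below $y$.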

B Proof of Lemma [SI 

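Since the function $\alpha \mapsto \alpha^{\alpha}(1-\alpha)^{1-\alpha}$ is symmetric around $1/2$, it suffices to prove that for every $x \in (0, 1/2]$,

$$f\left(\tfrac12 + x, \gamma, \eta\right) > f\left(\tfrac12 - x, \gamma, \eta\right) . \qquad (43)$$

To this end, fix $x \in [-1/2, 1/2]$ and $\gamma, \eta > 0$. Denote $\varepsilon = 1 - \gamma^2$. Plugging this notation and $\alpha = 1/2 + x$
into (13), we find that the following identity holds:

$$\begin{aligned}
\eta^{y/2^{k-1}} 2^{2k}\gamma^{2k} f\left(\tfrac12 + x, \gamma, \eta\right) &= \left[2x\varepsilon^2 + (2-\varepsilon)^2\right]^k - 2(1-\eta)\left[(2-\varepsilon) + 2x\varepsilon\right]^k + (1-\eta)^2(1+2x)^k \\
&= \sum_{j=0}^{k} \binom{k}{j} 2^j x^j \left[\varepsilon^{2j}(2-\varepsilon)^{2(k-j)} - 2(1-\eta)\varepsilon^j(2-\varepsilon)^{k-j} + (1-\eta)^2\right] \\
&= \sum_{j=0}^{k} \binom{k}{j} 2^j x^j \left[\varepsilon^j(2-\varepsilon)^{k-j} - (1-\eta)\right]^2 . \qquad (44)
\end{aligned}$$

This shows that we can write $f(1/2 + x, \gamma, \eta) = \sum_{j=0}^{k} c_j x^j$ for some $c_j \ge 0$, such that at most one of the
$c_j$'s is zero. Since for every $x > 0$ and odd $j$, $x^j - (-x)^j > 0$, (43) follows.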
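
The expansion used in this proof is a purely algebraic identity, so it can be checked independently of how $f$ is defined. The Python sketch below compares the closed form $[2x\varepsilon^2 + (2-\varepsilon)^2]^k - 2(1-\eta)[(2-\varepsilon) + 2x\varepsilon]^k + (1-\eta)^2(1+2x)^k$ with the power series whose coefficients are $c_j = \binom{k}{j} 2^j [\varepsilon^j(2-\varepsilon)^{k-j} - (1-\eta)]^2$. The function names and the sample parameter values are ours.

```python
import math
from math import comb

def closed_form(x, eps, eta, k):
    """Right-hand side of the identity before expanding in powers of x."""
    return ((2 * x * eps ** 2 + (2 - eps) ** 2) ** k
            - 2 * (1 - eta) * ((2 - eps) + 2 * x * eps) ** k
            + (1 - eta) ** 2 * (1 + 2 * x) ** k)

def coefficients(eps, eta, k):
    """c_j = C(k, j) 2^j [eps^j (2-eps)^(k-j) - (1-eta)]^2, nonnegative by construction."""
    return [comb(k, j) * 2 ** j * (eps ** j * (2 - eps) ** (k - j) - (1 - eta)) ** 2
            for j in range(k + 1)]

def series_form(x, eps, eta, k):
    """Same quantity written as sum_j c_j x^j."""
    return sum(c * x ** j for j, c in enumerate(coefficients(eps, eta, k)))

if __name__ == "__main__":
    eps, eta, k = 0.3, 0.6, 7
    for x in (-0.5, -0.2, 0.0, 0.3, 0.5):
        print(x, math.isclose(closed_form(x, eps, eta, k),
                              series_form(x, eps, eta, k), rel_tol=1e-9))
```

Because every coefficient is nonnegative, replacing $x$ by $-x$ can only decrease the odd-degree terms, which is the mechanism behind the comparison of $f$ at $1/2 + x$ and $1/2 - x$.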
C Proof of Lemma [HI

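In this section we will use the normalization (33). To prove the first assertion of the lemma, our goal is to
show that $g_0'(\alpha) < 0$ for $\frac12 < \alpha \le 1 - \frac{3\ln k}{k}$. Observe that

$$g_0'(\alpha) = \frac{f_0(\alpha)^{r-1}\left\{ r f_0'(\alpha) + f_0(\alpha)\left[\ln(1-\alpha) - \ln\alpha\right] \right\}}{\alpha^{\alpha}(1-\alpha)^{1-\alpha}} . \qquad (45)$$

Differentiating (34) at $x = 0$ we find that $f_0'(1/2) = 0$.
Since, by (44), $f_0(\alpha) > 0$, it is enough to show that the following function is decreasing on $\left[\frac12, 1 - \frac{3\ln k}{k}\right]$:

$$\psi(\alpha) = r f_0'(\alpha) + f_0(\alpha)\left[\ln(1-\alpha) - \ln\alpha\right] .$$

Now,

$$\psi'(\alpha) = r f_0''(\alpha) + f_0'(\alpha)\left[\ln(1-\alpha) - \ln\alpha\right] - f_0(\alpha)\left(\frac{1}{\alpha} + \frac{1}{1-\alpha}\right) .$$

Since for $1/2 < \alpha < 1$, $\ln(1-\alpha) < \ln\alpha$ and, by (44), $f_0' \ge 0$ on $(1/2, 1]$, it is thus enough to prove that

$$r f_0''(\alpha) < f_0(\alpha)\left(\frac{1}{\alpha} + \frac{1}{1-\alpha}\right) .$$

Now, $\frac{1}{\alpha} + \frac{1}{1-\alpha} \ge 4$ and, from (35), we get that for $\alpha \ge 1/2$,

$$f_0(\alpha) \ge f_0\left(\tfrac12\right) = 4(1-\varepsilon_0)^2(2-\varepsilon_0)^{2k-2} \ge (2-\varepsilon_0)^{2k-2} ,$$

where we also used that, by (37), $\varepsilon_0 \le 1/2$ for $k$ large enough. Thus, it suffices to prove

$$r f_0''\left(\tfrac12 + x\right) < 4(2-\varepsilon_0)^{2k-2} . \qquad (46)$$

Now, using that $x \le \frac12 - \frac{3\ln k}{k}$, we differentiate (34) twice to get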



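$$\begin{aligned}
f_0''\left(\tfrac12 + x\right) &= 4k(k-1)\left\{ \varepsilon_0^4\left[2x\varepsilon_0^2 + (2-\varepsilon_0)^2\right]^{k-2} - 2\varepsilon_0^3(2-\varepsilon_0)^{k-1}\left[2 - \varepsilon_0 + 2x\varepsilon_0\right]^{k-2} + \varepsilon_0^2(2-\varepsilon_0)^{2k-2}(1+2x)^{k-2} \right\} \\
&\le 4k^2\left\{ \varepsilon_0^4(2-\varepsilon_0)^{2k-4}\left[1 + \frac{2x\varepsilon_0^2}{(2-\varepsilon_0)^2}\right]^{k-2} + \varepsilon_0^2(2-\varepsilon_0)^{2k-2}(1+2x)^{k-2} \right\} \\
&\le 4k^2\left\{ \varepsilon_0^4(2-\varepsilon_0)^{2k-4}(1+2x)^{k-2} + \varepsilon_0^2(2-\varepsilon_0)^{2k-2}(1+2x)^{k-2} \right\} \\
&\le 8k^2\varepsilon_0^2(2-\varepsilon_0)^{2k-2}(1+2x)^{k} \\
&\le 8k^2\varepsilon_0^2(2-\varepsilon_0)^{2k-2}\left(2 - \frac{6\ln k}{k}\right)^{k} \\
&\le \frac{128(1-y)^2}{k\,2^k}(2-\varepsilon_0)^{2k-2} ,
\end{aligned}$$

where in the last line we used the bound $\left(2 - \frac{6\ln k}{k}\right)^k \le 2^k/k^3$ and the fact that for $k$ large enough, (37) implies $\varepsilon_0 \le 4(1-y)/2^k$.
Combining this estimate with (46) (and recalling that $r < 2^k \ln 2/(1 - y + y\ln y)$), we see that we must show that for sufficiently large $k$

$$\frac{128\ln 2}{k}\cdot\frac{(1-y)^2}{1 - y + y\ln y} < 4 ,$$

and this is indeed the case by (36).

It remains to show that $g_0''(1/2) < 0$. Denoting $\zeta(\alpha) = \alpha^{-\alpha}(1-\alpha)^{\alpha-1}$, we see from (45) that $g_0'(\alpha) =
f_0(\alpha)^{r-1}\psi(\alpha)\zeta(\alpha)$. Since $f_0'(1/2) = 0$, $\zeta'(1/2) = 0$, and we have just verified that $\psi'(1/2) < 0$, the required
result follows.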



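D Proof of Lemma 10

Our goal is to show that for any $0 < r < \tau_k$ and $1 - \frac{3\ln k}{k} \le \alpha \le 1$,

$$\left[\frac{f(\alpha, \sqrt{\gamma_0}, \sqrt{\eta_0})}{f(1/2, \gamma_0, \eta_0)}\right]^r < 2\alpha^{\alpha}(1-\alpha)^{1-\alpha} . \qquad (47)$$

The following lemma gives an upper bound for the left-hand side of (47).

Lemma 13. For all sufficiently large $k$,

$$\frac{f(\alpha, \sqrt{\gamma_0}, \sqrt{\eta_0})}{f(1/2, \gamma_0, \eta_0)} \le y^{y/2^k}\left[1 + \frac{2(1-y) - 2(1-\sqrt{y}) + (1-\sqrt{y})^2\alpha^k}{2^k} + \frac{60k(1-y)^2}{2^{2k}}\right] . \qquad (48)$$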



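Proof. Denote $\varepsilon_1 = 1 - (\sqrt{\gamma_0})^2 = 1 - \sqrt{1-\varepsilon_0}$. For $1 - \frac{3\ln k}{k} \le \alpha \le 1$ write $x = \alpha - 1/2$. Analogously to
(44) we have the following identity:

$$\eta_0^{y/2^k} 2^{2k}\gamma_0^{k} f(\alpha, \sqrt{\gamma_0}, \sqrt{\eta_0}) = \left[2x\varepsilon_1^2 + (2-\varepsilon_1)^2\right]^k - 2(1-\sqrt{\eta_0})\left[(2-\varepsilon_1) + 2x\varepsilon_1\right]^k + (1-\sqrt{\eta_0})^2(1+2x)^k . \qquad (49)$$

Our first goal is to replace $\eta_0$ in the right-hand side of (49) by its upper bound from (38). To this end consider
the function

$$p(b) = \left[2x\varepsilon_1^2 + (2-\varepsilon_1)^2\right]^k - 2b\left[(2-\varepsilon_1) + 2x\varepsilon_1\right]^k + b^2(1+2x)^k \qquad (50)$$

and observe that the right-hand side of (49) equals $p(1 - \sqrt{\eta_0})$. So it is enough to show that $p$ is decreasing
on $[0, 1]$. Since $p$ is convex and quadratic, this would follow once we show that $p'(1) \le 0$. This is equivalent
to $1 + 2x \le 2 - \varepsilon_1 + 2x\varepsilon_1$, which is true since $x \le 1/2$. Hence

$$\eta_0^{y/2^k} 2^{2k}\gamma_0^{k} f(\alpha, \sqrt{\gamma_0}, \sqrt{\eta_0}) = p(1 - \sqrt{\eta_0}) \le p(1 - \sqrt{z}) = \left[2x\varepsilon_1^2 + (2-\varepsilon_1)^2\right]^k - 2(1-\sqrt{z})\left[(2-\varepsilon_1) + 2x\varepsilon_1\right]^k + (1-\sqrt{z})^2(1+2x)^k , \qquad (51)$$

where $z$ is the upper bound for $\eta_0$ from (38), i.e.,

$$z = \min\left\{ y,\; y - \frac{(k+1)(1-y)}{2^k-k-1} + \frac{4k(1-y)^2}{2^k} \right\} . \qquad (52)$$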
Hence, using $\eta_0 \le z$ and the identity (35), we bound the ratio in (47) as follows:

VTo, 0?o) 
/(l/2,7o,r;o) 



Vo ■ 7o ■ 



,^./2-^22'^7o''/(l/2,7o,%) 



< 



4(l-£o)2(2-£o)2'=-2 
4(l-eo)2(2-eo)''=-2 



{[2x£i2 + (2 - ei)2]fe - 2(1 - ^/i)[(2 - eO + 2x£i]'= + (1 - 71)2(1 + 2x)^} 



2(1 -£o) 



Vl - £o 
(l-eo/2)^ 
2(1 -Vi)[l-£i , (l-^/I)2a'=' 



2fe 



2fe 



(53) 



We will bound the various terms in (53) separately. First of all, using (52) and the inequality $e^a \le
1 + a + a^2$, which is valid for $0 \le a \le 1$, we get that



< 



„a/2'= 



(fc+l)(l-y) , 4fc(l-y)2 



y"' cxp 



^(2*= - A: - 1) y2'= 
(fc + l)(l-y) , 4fc(l-y) 



2 1 



1 



2'=(2'^- - fc - 1) 22fc 
(fc + l)(l-y) , 8fc(l-y)2 



2'=(2'^--fc-l) 



22fc 



(54) 



as long as k is large enough. 

Next, using the inequality $1/(1-a) \le 1 + 2a$, valid for $0 \le a \le 1/2$, we get



1 



So 



2(1 -eo) 



< 



l + |(l + 2£o) 



< 1 



(55) 



Next, using the inequality $\sqrt{1-x} \le 1 - x/2$, the inequality $1/(1-a) \le 1 + a + 2a^2$, valid for $0 \le a \le 1/2$,
and the inequality $(1+a)^k \le 1 + ka + k^2a^2/2$, which is valid for all $0 \le a \le 1/(4k^2)$, we get that since for $k$
large enough $\varepsilon_0 \le 1/(4k^2)$,



yi - gp 
L(i-go/2)2 

Hence, for k large enough 
1 



< 



(l-eo/2)'= - V ^ 2 ^ 2 ' - 



^ + + kel 



So 



2(1 -£o) 



(l-eo/2)2 



< 1+ 1 



(56) 



(57) 



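Next, using the inequality $x/2 \le 1 - \sqrt{1-x} \le x/2 + x^2$, which is valid for $0 \le x \le 1/2$, we get that

$$\frac{\varepsilon_0}{2} \le \varepsilon_1 = 1 - \sqrt{1-\varepsilon_0} \le \frac{\varepsilon_0}{2} + \varepsilon_0^2 \le \varepsilon_0 . \qquad (58)$$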



Observe that since < .t < 1/2 and £i < 1/2, the function ei i-^ ^ + (l — is decreasing in ei. Hence, 
the lower bound in H58() . together with another application of the inequality (1 + a)*^ < 1 + ka + k'^a'^/2, 
valid for all $0 \le a \le 1/(4k^2)$, implies that for sufficiently large $k$



r ^2 



< 



^ + 1 - — 
32 ' 



< 



1 £o £o 
2 10 



<l-^+'^ + kel. (59) 



The second term in the brackets of (53) appears with a minus sign, so we bound it from below, using the
fact that $z \le y$ and $\varepsilon_1 \le \varepsilon_0$:



2k 



> 



> 



> 



> 



where we have used the upper bound in H37|l . 

Combining lE3I), 113}, (EHll and we get 



2(1 




2(1 - Vl)fcei(l-a) 




2^ 


2fc 


2(1 




2(1 - 7y)fceo 




2^ 


2k 


2(1 




2{1 ^y)k 2{l~y) 




2^ 


2k 2'= - fc - 


2(1 




8fc(l-y)2 




2^ 





(60) 



/(«, Vto, Vv^) 

/(l/2,7o,%) 



< 



1 - 



(fc + l)(l-y) 8fc(l-y)^ 
2'=(2fc-fc-l) 











1 + 











^ + ksl 



ks„ eel ^ , 2 2(1 - yi) 8fc(l - yf (1 - Vi)^a'= 



(fc + l)(l-y) , 8fc(l-y)2 



1 + eo 



2'=(2'= - fc - 1) 

(1 - V^fo 

2k 2^ ^ ' 

(fc + l)(l-y) , 8k{l-yf ^ 
2^{2'' - fc - 1) 



22A; 

2(1 -V^) , 8fc(l-y)2 fc(l-yi)% 8fc2(l-y)% 



k^e^ 



22k 



22k ' 2k ' 2^'^ ' '" 

2(1 -y) , (1-Vi)'«'' 2(1 -yi) , 30fc(l-y)2 



2*^ - fc - 1 



2k 



2k 



22k 



< 1 + 



2(1 -y) (fc + l)(l_y) (1-VI) 



2^,fc 



2(1 -Vi) , 50fc(l-y)= 



2fc 



1 2'=(2'= 



1) 



2/c 



2/c 



22fc 



(61) 



where we have used the upper bound in (37), (52) and the fact that $k$ is large enough.
Now, we claim that for every $\alpha \in [0, 1]$,

$$(1-\sqrt{z})^2\alpha^k - 2(1-\sqrt{z}) \le (1-\sqrt{y})^2\alpha^k - 2(1-\sqrt{y}) + z - y . \qquad (62)$$

Indeed, since by (52), $z \le y$, the left-hand side minus the right-hand side of (62) is an increasing function of $\alpha$,
which vanishes at $\alpha = 1$. Moreover, by (52),

$$z - y \le -\frac{(k+1)(1-y)}{2^k-k-1} + \frac{4k(1-y)^2}{2^k} ,$$

so that (62) becomes

$$(1-\sqrt{z})^2\alpha^k - 2(1-\sqrt{z}) \le (1-\sqrt{y})^2\alpha^k - 2(1-\sqrt{y}) - \frac{(k+1)(1-y)}{2^k-k-1} + \frac{4k(1-y)^2}{2^k} .$$
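Plugging this into (61) we get

$$\frac{f(\alpha, \sqrt{\gamma_0}, \sqrt{\eta_0})}{f(1/2, \gamma_0, \eta_0)} \le y^{y/2^k}\left[1 + \frac{2(1-y)}{2^k-k-1} - \frac{2(k+1)(1-y)}{2^k(2^k-k-1)} + \frac{(1-\sqrt{y})^2\alpha^k}{2^k} - \frac{2(1-\sqrt{y})}{2^k} + \frac{60k(1-y)^2}{2^{2k}}\right] ,$$

which is (48), since the second and third terms in the brackets sum to $2(1-y)/2^k$. This concludes the proof of Lemma 13. $\Box$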



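Denote $h(\alpha) = -\alpha\ln\alpha - (1-\alpha)\ln(1-\alpha)$. Taking logarithms of (47), and using (48) and the inequality
$\ln(1+x) \le x$, we see that our goal is reduced to showing that for all $r < \tau_k$,

$$\frac{r}{2^k}\left[ y\ln y + 2(1-y) - 2(1-\sqrt{y}) + (1-\sqrt{y})^2\alpha^k + \frac{60k(1-y)^2}{2^k} \right] < \ln 2 - h(\alpha) . \qquad (64)$$

For simplicity denote:

$$A = (1-\sqrt{y})^2 , \qquad B = y\ln y + 2(1-y) - 2(1-\sqrt{y}) + \frac{60k(1-y)^2}{2^k} . \qquad (65)$$

With this notation (64) becomes

$$\frac{r}{2^k} < \frac{\ln 2 - h(\alpha)}{A\alpha^k + B} \equiv M(\alpha) ,$$

and this should hold for all $\alpha \ge 1 - \frac{3\ln k}{k}$. We are therefore interested in the minimal value of $M$ on the
interval $\left[1 - \frac{3\ln k}{k}, 1\right]$. The derivative of $M$ is

$$M'(\alpha) = \frac{(A\alpha^k + B)\left[\ln\alpha - \ln(1-\alpha)\right] - kA\alpha^{k-1}\left[\ln 2 - h(\alpha)\right]}{\left[A\alpha^k + B\right]^2} . \qquad (66)$$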



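In particular, $M'(1) = \infty$, so that the minimum of $M$ cannot occur at $\alpha = 1$. We rule out the possibility of
the minimum being at $1 - \frac{3\ln k}{k}$ in the following claim.

Claim 1. If $k$ is large enough then for every $1 - \frac{3\ln k}{k} \le \alpha \le 2^{-1/k}$, $M(\alpha) > M(1)$.

Proof. Observe that for every $\beta \in [0, 1]$,

$$h(\beta) = \beta\ln(1/\beta) - (1-\beta)\ln(1-\beta) \le \beta\left(\frac{1}{\beta} - 1\right) - (1-\beta)\ln(1-\beta) = 1 - \beta - (1-\beta)\ln(1-\beta) . \qquad (67)$$

Hence, since $\alpha \ge 1 - \frac{3\ln k}{k}$,

$$h(\alpha) \le h\left(1 - \frac{3\ln k}{k}\right) \le \frac{3\ln k}{k} - \frac{3\ln k}{k}\ln\left(\frac{3\ln k}{k}\right) \le \frac{4(\ln k)^2}{k} . \qquad (68)$$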



Using the fact that a <2 ^Z'"', it follows that 

j^2_ 5(lnfc)f 



M [a) > 



> 



In2(l-Ml£a: 
'-+B 
On the other hand, $M(1) = \ln 2/(A + B)$, so that it is enough to show that



10(lnA:)2 
k 



> 



which is equivalent to 



A + B 



> 



A + B 
20(lnfc)^ 



1 



A + B 



(69) 



Observe that since $1 - \sqrt{y} \ge (1-y)/2$, $A \ge (1-y)^2/4$. On the other hand, using (36) we get that for
sufficiently large $k$,

$$A + B = 1 - y + y\ln y + \frac{60k(1-y)^2}{2^k} \le (1-y)^2 + \frac{60k(1-y)^2}{2^k} \le 2(1-y)^2 .$$

It follows that the left-hand side in (69) is at least $1/8$, so that (69) holds provided $k$ is large enough. $\Box$

By Claim 1 it remains to bound $M(\alpha)$ from below when $\alpha > 2^{-1/k}$ and $M'(\alpha) = 0$. In this case, by (66),

$$-\ln(1-\alpha) = -\ln\alpha + \frac{kA\alpha^{k-1}}{A\alpha^k + B}\left[\ln 2 - h(\alpha)\right] . \qquad (70)$$



From the lower bound a > 2 and (|67|l it follows that h{a) < "^'^^ . Hence (|7U|I . together with our 
assumption that k is large, implies 



ln(l - a) > 



kA/2 
A + B 



In 2 



4 In A: 



> 



A 



5 A + B 



(71) 



As we have seen in the proof of Claim 1, $A \ge (1-y)^2/4$ and $A + B \le 2(1-y)^2$. Plugging these inequalities
into (71), we get that $-\ln(1-\alpha) \ge k/40$, i.e. $\alpha \ge 1 - e^{-k/40}$. Plugging this into (70) once more, we get that



ln(l - .) > fe^[l-2fcy^")][i,2- 2/(e^V40)] > fcAln2 



Finally, we have shown that 



A + B 



a > 1 — exp 



A + B 



6 



ofe/40 



fcAln2 
A + B 



ofe/40 



(72) 



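We are now ready to bound $M(\alpha)$ from below. We start by recalling that

$$A + B = 1 - y + y\ln y + \frac{60k(1-y)^2}{2^k} \equiv 1 - y + y\ln y + P .$$

Using the inequality $1/(1+x) \ge 1 - x$, we get

$$\frac{1}{A+B} \ge \frac{1}{1 - y + y\ln y}\left(1 - \frac{P}{1 - y + y\ln y}\right) \ge \frac{1}{1 - y + y\ln y}\left(1 - \frac{120k}{2^k}\right) , \qquad (73)$$

where the last inequality used (36). Of course, we also know that $A + B \ge 1 - y + y\ln y$.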
Now, using (EZ|), (1721), GSJ, and the fact that (1 - ^/yf /{I - y + yhiy) < 1, we get 



h{a) < 



2kAlii2 



A + B 

< 2k In 2 exp 

< 2k In 2 exp 



1 



ofc/40 



exp 



/cylln2 



A + B 



oA;/40 



fc(l-Vy)^ln2 / 

1 - y + y In y V 
fc(l- Vy)^ln2 / 120fc 

1 — y + y In y \ 2'^ 



120A:\ / 



ofc/40 



= fe/40 



< 10fcln2 • 2-'='^^«\ 
where, as in Proposition 7, $\varphi(y) = \frac{(1-\sqrt{y})^2}{1 - y + y\ln y}$.
So, using (75), we get

$$\begin{aligned}
M(\alpha) &\ge \frac{\ln 2 - h(\alpha)}{A + B} \\
&\ge \frac{\ln 2}{A + B}\left(1 - 10k\,2^{-k\varphi(y)}\right) \\
&\ge \frac{\ln 2}{1 - y + y\ln y}\left(1 - \frac{120k}{2^k}\right)\left(1 - 10k\,2^{-k\varphi(y)}\right) \\
&\ge \frac{\ln 2}{1 - y + y\ln y}\left(1 - \frac{120k}{2^k} - 10k\,2^{-k\varphi(y)}\right) ,
\end{aligned}$$

where we have used the fact that $\varphi(y) \ge 1/2$ and that $k$ is sufficiently large.
This concludes the proof of Lemma 10.
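
To get a feel for the quantity bounded in this appendix, one can evaluate $M(\alpha) = (\ln 2 - h(\alpha))/(A\alpha^k + B)$ on a grid of $[1 - 3\ln k/k, 1]$ and compare its minimum with $\ln 2/(1 - y + y\ln y)$. The Python sketch below assumes $A = (1-\sqrt{y})^2$ and $B = y\ln y + 2(1-y) - 2(1-\sqrt{y}) + 60k(1-y)^2/2^k$, which is our reading of the definitions of $A$ and $B$; the function names and the sample values $y = 1/2$, $k = 400$ are ours.

```python
import math

def h(a):
    """Entropy in nats: -a ln a - (1-a) ln(1-a), with 0 ln 0 = 0."""
    total = 0.0
    for p in (a, 1.0 - a):
        if p > 0.0:
            total -= p * math.log(p)
    return total

def A_B(y, k):
    """A and B as assumed above (requires 0 < y < 1)."""
    root = math.sqrt(y)
    A = (1.0 - root) ** 2
    B = (y * math.log(y) + 2.0 * (1.0 - y) - 2.0 * (1.0 - root)
         + 60.0 * k * (1.0 - y) ** 2 / 2.0 ** k)
    return A, B

def M(a, y, k):
    """M(a) = (ln 2 - h(a)) / (A a^k + B)."""
    A, B = A_B(y, k)
    return (math.log(2.0) - h(a)) / (A * a ** k + B)

def min_M_on_grid(y, k, num_points=20001):
    """Smallest value of M on an even grid of [1 - 3 ln k / k, 1]."""
    left = 1.0 - 3.0 * math.log(k) / k
    return min(M(left + (1.0 - left) * i / (num_points - 1), y, k)
               for i in range(num_points))

if __name__ == "__main__":
    y, k = 0.5, 400
    benchmark = math.log(2.0) / (1.0 - y + y * math.log(y))
    print(min_M_on_grid(y, k), benchmark)
```

For $k$ this large, any interior minimizer lies so close to $\alpha = 1$ that a floating-point grid cannot resolve it, so the grid minimum is attained at $\alpha = 1$ and agrees with the benchmark up to rounding. The sketch is a plausibility check on the size of the bound, not a substitute for the argument.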